diff --git "a/logs/main_log.txt" "b/logs/main_log.txt" deleted file mode 100644--- "a/logs/main_log.txt" +++ /dev/null @@ -1,122564 +0,0 @@ -***************************************** -Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. -***************************************** -***************************************** -Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. -***************************************** -***************************************** -Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. -***************************************** -***************************************** -Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. -***************************************** -***************************************** -Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. -***************************************** -***************************************** -Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. -***************************************** -***************************************** -Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. -***************************************** -***************************************** -Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. -***************************************** -***************************************** -Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. -***************************************** -***************************************** -Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. -***************************************** -***************************************** -Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. 
---------------------------------------------------
-DeepSpeed C++/CUDA extension op report
---------------------------------------------------
-NOTE: Ops not installed will be just-in-time (JIT) compiled at
-      runtime if needed. Op compatibility means that your system
-      meet the required dependencies to JIT install the op.
---------------------------------------------------
-JIT compiled ops requires ninja
-ninja .................. [OKAY]
---------------------------------------------------
-op name ................ installed .. compatible
---------------------------------------------------
-cpu_adam ............... [YES] ...... [OKAY]
-fused_adam ............. [NO] ....... [OKAY]
-fused_lamb ............. [NO] ....... [OKAY]
-sparse_attn ............ [NO] ....... [OKAY]
-transformer ............ [NO] ....... [OKAY]
-stochastic_transformer . [NO] ....... [OKAY]
- [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`.
-async_io ............... [NO] ....... [NO]
-transformer_inference .. [NO] ....... [OKAY]
-utils .................. [YES] ...... [OKAY]
-quantizer .............. [NO] ....... [OKAY]
---------------------------------------------------
-DeepSpeed general environment info:
-torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']
-torch version .................... 1.8.1
-torch cuda version ............... 11.1
-nvcc version ..................... 11.2
-deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']
-deepspeed info ................... 0.4.2+bc17042, bc17042, big-science
-deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1
--------------------------------------------------
-/bin/sh: line 0: type: git: not found
-**** Git info for Megatron: git_hash=unknown git_branch=unknown ****
['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1DeepSpeed general environment info: - -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -/bin/sh: line 0: type: git: not found -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO]async_io ....... ...............[NO] -[NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -transformer_inference .. [NO] .......utils [OKAY].................. - [YES] ...... [OKAY] -utils ..................quantizer [YES].............. 
......[NO] [OKAY] -....... [OKAY] -quantizer ..............-------------------------------------------------- -[NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install pathtorch install path .............................. ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch versiontorch version ........................................ 1.8.11.8.1 - -torch cuda versiontorch cuda version .............................. 11.1 -11.1 -nvcc version nvcc version..................... .....................11.2 -11.2deepspeed install path - deepspeed install path........... ...........['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']deepspeed info - deepspeed info................... ...................0.4.2+bc17042, bc17042, big-science -0.4.2+bc17042, bc17042, big-sciencedeepspeed wheel compiled w. - deepspeed wheel compiled w....... ......torch 1.8, cuda 11.1 -torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -DeepSpeed general environment info: -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -nvcc version ..................... 11.2 -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. 
--------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -DeepSpeed general environment info: -torch install path ............... DeepSpeed general environment info: -['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch versiontorch install path .................... ...............1.8.1 -torch cuda version ............... 11.1['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -nvcc version .....................torch version 11.2.................... - deepspeed install path1.8.1 -........... torch cuda version ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']............... - deepspeed info11.1 -...................nvcc version 0.4.2+bc17042, bc17042, big-science..................... - deepspeed wheel compiled w.11.2 -......deepspeed install path torch 1.8, cuda 11.1........... - ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... 
torch 1.8, cuda 11.1 -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install pathtorch install path .............................. ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch versiontorch version ........................................ 1.8.11.8.1 - -torch cuda versiontorch cuda version .............................. 11.111.1 - -nvcc versionnvcc version .......................................... 11.211.2 - -deepspeed install pathdeepspeed install path ...................... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed infodeepspeed info ...................................... 0.4.2+bc17042, bc17042, big-science0.4.2+bc17042, bc17042, big-science - -deepspeed wheel compiled w.deepspeed wheel compiled w. ............ torch 1.8, cuda 11.1torch 1.8, cuda 11.1 - - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. 
--------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] -ninja .................. [OKAY] --------------------------------------------------- --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 
0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -transformer_inference .. [NO] ....... [OKAY] -async_io ............... utils[NO] ......................... [YES][NO] -...... [OKAY] -quantizer .............. [NO] .......transformer_inference [OKAY].. - [NO] .......-------------------------------------------------- -[OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. 
--------------------------------------------------- -JIT compiled ops requires ninja - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -ninja .................. [OKAY] --------------------------------------------------- -DeepSpeed general environment info: -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -fused_adam ............. [NO] ....... [OKAY] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -fused_lamb ............. [NO] ....... [OKAY] -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info:DeepSpeed general environment info: - -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -torch install pathtorch install path .............................. ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch versiontorch version ........................................ 1.8.11.8.1 - -stochastic_transformer . [NO] ....... [OKAY] -torch cuda versiontorch cuda version .............................. 11.111.1 - -nvcc versionnvcc version .......................................... 11.211.2 - -deepspeed install pathdeepspeed install path ...................... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed infodeepspeed info ...................................... 0.4.2+bc17042, bc17042, big-science0.4.2+bc17042, bc17042, big-science - -deepspeed wheel compiled w.deepspeed wheel compiled w. ............ torch 1.8, cuda 11.1torch 1.8, cuda 11.1 - --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_ioasync_io .............................. [NO][NO] .............. [NO][NO] - -transformer_inferencetransformer_inference .... [NO][NO] .............. [OKAY][OKAY] - -utilsutils .................................... [YES][YES] ............ [OKAY][OKAY] - -quantizer quantizer.............. ..............[NO] [NO]....... 
.......[OKAY] -[OKAY] --------------------------------------------------- --------------------------------------------------- -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install pathtorch install path .............................. ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch versiontorch version ........................................ 1.8.11.8.1 - -torch cuda versiontorch cuda version .............................. 11.111.1 - -nvcc versionnvcc version .......................................... 11.211.2 - -deepspeed install pathdeepspeed install path ...................... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed infodeepspeed info ...................................... 0.4.2+bc17042, bc17042, big-science0.4.2+bc17042, bc17042, big-science - -deepspeed wheel compiled w.deepspeed wheel compiled w. ............ torch 1.8, cuda 11.1torch 1.8, cuda 11.1 - - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. 
compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -/bin/sh: line 0: type: git: not found -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install path ...............torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... torch version1.8.1 -.................... torch cuda version1.8.1 -............... 11.1torch cuda version - nvcc version............... .....................11.1 - 11.2nvcc version - deepspeed install path..................... ...........11.2 -deepspeed install path['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -...........deepspeed info ...................['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -0.4.2+bc17042, bc17042, big-sciencedeepspeed info - deepspeed wheel compiled w.................... ......0.4.2+bc17042, bc17042, big-science -torch 1.8, cuda 11.1 -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... 
[OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... 
- [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`.
-async_io ............... [NO] ....... [NO]
-transformer_inference .. [NO] ....... [OKAY]
-utils .................. [YES] ...... [OKAY]
-quantizer .............. [NO] ....... [OKAY]
---------------------------------------------------
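The async_io `[NO]` is expected when the libaio headers are absent; the warning itself gives the fix (`apt install libaio-dev`, then rebuild the op). A quick way to confirm from Python whether the shared library is visible at all (a sketch, not DeepSpeed's own probe):

    import ctypes.util

    # find_library returns None when libaio is not installed; once it is
    # present and the extension is rebuilt, async_io can flip to [OKAY].
    print(ctypes.util.find_library("aio"))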
---------------------------------------------------
-DeepSpeed C++/CUDA extension op report
---------------------------------------------------
-NOTE: Ops not installed will be just-in-time (JIT) compiled at
-      runtime if needed. Op compatibility means that your system
-      meets the required dependencies to JIT install the op.
---------------------------------------------------
-JIT compiled ops requires ninja
-ninja .................. [OKAY]
---------------------------------------------------
-op name ................ installed .. compatible
---------------------------------------------------
-cpu_adam ............... [YES] ...... [OKAY]
-fused_adam ............. [NO] ....... [OKAY]
-fused_lamb ............. [NO] ....... [OKAY]
-sparse_attn ............ [NO] ....... [OKAY]
-transformer ............ [NO] ....... [OKAY]
-stochastic_transformer . [NO] ....... [OKAY]
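As the NOTE says, ops marked `[NO]` under "installed" are built on first use rather than at install time. For example, constructing DeepSpeed's CPU Adam optimizer triggers a ninja JIT build when no prebuilt extension exists (`DeepSpeedCPUAdam` is the real class; the surrounding code is an illustrative sketch — in this log cpu_adam shows `[YES]`, so no build would actually run):

    import torch
    from deepspeed.ops.adam import DeepSpeedCPUAdam

    params = [torch.nn.Parameter(torch.zeros(8))]
    # First instantiation JIT-compiles the cpu_adam extension with ninja
    # if it was not prebuilt into the wheel.
    opt = DeepSpeedCPUAdam(params, lr=1e-4)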
-*****************************************
-Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
-*****************************************
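The launcher's default of one OpenMP thread per rank is conservative and can be tuned per job. A sketch of pinning it explicitly before the heavy imports (the value is illustrative):

    import os

    # Must be set before torch/numpy initialize their thread pools.
    os.environ.setdefault("OMP_NUM_THREADS", "1")

    import torch
    torch.set_num_threads(int(os.environ["OMP_NUM_THREADS"]))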
[OKAY][OKAY] - -fused_adam ............. [NO] ....... [OKAY]fused_adam - fused_adam.............fused_adam fused_lamb ............. [NO] ............. .............[NO] ....... [NO] [NO] .......[OKAY].............. - [OKAY][OKAY][OKAY] - - -fused_lamb fused_lamb.............fused_lamb ............. [NO]............. [NO].......[NO] sparse_attn[OKAY]....... -....... ............ [OKAY] [OKAY] -[NO] - ....... [OKAY] -transformer ............ sparse_attn[NO] ................... sparse_attn[OKAY] -[NO]sparse_attn ............................... stochastic_transformer[OKAY] -[NO][NO] ...............transformer [NO][OKAY]............ [OKAY]....... - - [NO][OKAY]transformer transformer -................... ............[OKAY][NO] - [NO]....... .......[OKAY] -[OKAY]stochastic_transformer - .stochastic_transformerstochastic_transformer [NO] ......... [OKAY] [NO] -[NO] .............. [OKAY][OKAY] - ----------------------------------------------------------------------------------------------------- - -DeepSpeed C++/CUDA extension op reportDeepSpeed C++/CUDA extension op report --------------------------------------------------- ----------------------------------------------------------------------------------------------------- - - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.DeepSpeed C++/CUDA extension op reportNOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- - --------------------------------------------------- -----------------------------------------------------------------------------------------------------DeepSpeed C++/CUDA extension op report - - - -JIT compiled ops requires ninjaJIT compiled ops requires ninja-------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- ---------------------------------------------------JIT compiled ops requires ninja - -JIT compiled ops requires ninja -ninjaninjaninjaninja .................. ..................[OKAY].................. -.................. [OKAY]-------------------------------------------------- [OKAY] - -[OKAY] ---------------------------------------------------op name - --------------------------------------------------- -------------------------------------------------- -op name................ - op name................installed op name ................ ..installed ................installed compatible -..installed.. -------------------------------------------------- -compatible..compatible - -compatible---------------------------------------------------------------------------------------------------- - - --------------------------------------------------- -cpu_adam ............... [YES] ...... cpu_adam[OKAY]cpu_adam -cpu_adam.............................. ...............[YES][YES] ......[YES]...... [OKAY]......[OKAY] -fused_adam - [OKAY]............. - [NO] ....... 
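-The report above is DeepSpeed's standard op summary (also printed by its `ds_report` utility): prebuilt ops show [YES]; the rest are JIT-compiled with ninja on first use. As a purely illustrative sketch (not DeepSpeed's own code; the deepspeed.ops.<op> module paths are an assumption), the "installed" column boils down to an importability check:
-
-    import importlib.util
-
-    OPS = ["cpu_adam", "fused_adam", "fused_lamb",
-           "sparse_attn", "transformer", "stochastic_transformer"]
-
-    def has_module(name):
-        try:
-            return importlib.util.find_spec(name) is not None
-        except ModuleNotFoundError:  # parent package itself is absent
-            return False
-
-    print("ninja .................", "[OKAY]" if has_module("ninja") else "[FAIL]")
-    for op in OPS:
-        installed = has_module(f"deepspeed.ops.{op}")  # hypothetical module path
-        print(f"{op:<23} {'[YES]' if installed else '[NO]'}")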
- [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`.
-async_io ............... [NO] ....... [NO]
-transformer_inference .. [NO] ....... [OKAY]
-utils .................. [YES] ...... [OKAY]
-quantizer .............. [NO] ....... [OKAY]
--------------------------------------------------
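-Only async_io is both uninstalled and incompatible here: it needs libaio, and the warning itself names the fix (`apt install libaio-dev`). A quick stdlib check for the library on a node (illustrative; this is just a generic shared-library lookup, not DeepSpeed's own probe):
-
-    import ctypes.util
-
-    # find_library resolves shared libraries the way the system linker would
-    if ctypes.util.find_library("aio") is None:
-        print("libaio missing -> async_io stays [NO]; fix: apt install libaio-dev")
-    else:
-        print("libaio present -> async_io can be JIT-built")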
-DeepSpeed general environment info:
-torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']
-torch version .................... 1.8.1
-torch cuda version ............... 11.1
-nvcc version ..................... 11.2
-deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']
-deepspeed info ................... 0.4.2+bc17042, bc17042, big-science
-deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1
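-Each rank prints this same environment summary. A rough equivalent using public attributes of the installed packages (nvcc is omitted since its version is queried from the CUDA toolkit binary, not from Python):
-
-    import torch
-    import deepspeed
-
-    print("torch install path ...............", list(torch.__path__))
-    print("torch version ....................", torch.__version__)
-    print("torch cuda version ...............", torch.version.cuda)
-    print("deepspeed install path ...........", list(deepspeed.__path__))
-    print("deepspeed info ...................", deepspeed.__version__)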
-/bin/sh: line 0: type: git: not found
-**** Git info for Megatron: git_hash=unknown git_branch=unknown ****
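-`git` is not on PATH inside the job environment, so Megatron reports the hash and branch as unknown; the message is informational only. A sketch of that fallback (an assumption about its shape, not Megatron's actual code):
-
-    import shutil
-    import subprocess
-
-    def git_info():
-        if shutil.which("git") is None:      # mirrors "type: git: not found"
-            return "unknown", "unknown"
-        rev = subprocess.check_output(
-            ["git", "rev-parse", "--short", "HEAD"], text=True).strip()
-        branch = subprocess.check_output(
-            ["git", "rev-parse", "--abbrev-ref", "HEAD"], text=True).strip()
-        return rev, branch
-
-    h, b = git_info()
-    print(f"**** Git info for Megatron: git_hash={h} git_branch={b} ****")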
Op compatibility means that your system - meet the required dependencies to JIT install the op.--------------------------------------------------JIT compiled ops requires ninja - - --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.JIT compiled ops requires ninja - --------------------------------------------------- -JIT compiled ops requires ninja - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] -async_io-------------------------------------------------- -............... [NO] ....... [NO] -ninjaninjaninjaninja ........................................................................ [OKAY] [OKAY][OKAY] -[OKAY] - - ----------------------------------------------------------------------------------------------------- - -transformer_inference .. [NO] ....... [OKAY] ---------------------------------------------------op name-------------------------------------------------- -op name -................ op name ................op name installed installed ................ .................. ..installed installed compatible....compatible - -utils .................. [YES] ...... [OKAY] -----------------------------------------------------------------------------------------------------compatiblecompatible - - - ----------------------------------------------------------------------------------------------------- - -quantizer .............. [NO] ....... [OKAY] -cpu_adamcpu_adam ..............................cpu_adamcpu_adam [YES][YES].............................. ............[YES][YES] [OKAY][OKAY]............ - --------------------------------------------------- - [OKAY][OKAY] - -fused_adamfused_adam .......................... fused_adam[NO]fused_adam [NO] ........................................ [NO][OKAY][OKAY] -[NO] -....... .......fused_lambfused_lamb[OKAY] -[OKAY]..........................fused_lamb - [NO][NO]fused_lamb............. ........................... [NO] [OKAY] -[OKAY][NO]....... - .......[OKAY] -[OKAY] -sparse_attnsparse_attn ............sparse_attn............sparse_attn [NO][NO] ............ .......................... [NO] [NO][OKAY] [OKAY] - ....... -....... transformer [OKAY] [OKAY]transformer -............ - ............transformer[NO] [NO]transformer................... ....... [NO][OKAY] ............ -[OKAY] -.......[NO] stochastic_transformer [OKAY]stochastic_transformer ....... - .. [OKAY] stochastic_transformer[NO] -[NO] .............. stochastic_transformer [OKAY]. [OKAY] - -.[NO] [NO]....... .......[OKAY] -[OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. 
Op compatibility means that your system - meet the required dependencies to JIT install the op.-------------------------------------------------- --------------------------------------------------- - ---------------------------------------------------JIT compiled ops requires ninjaDeepSpeed C++/CUDA extension op report - - -DeepSpeed C++/CUDA extension op report ----------------------------------------------------------------------------------------------------- - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - ----------------------------------------------------------------------------------------------------- - -JIT compiled ops requires ninjaJIT compiled ops requires ninja - --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_io async_io............... ...............[NO] [NO]....... .......[NO] -[NO] -transformer_inference .. [NO] ....... [OKAY]transformer_inference - .. [NO] ....... [OKAY]utils - .................. [YES] ...... [OKAY] -utils ..................quantizer [YES] .................... [NO] ....... [OKAY][OKAY] - --------------------------------------------------- -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - -ninjaninjaninjaninja ........................................................................ [OKAY][OKAY][OKAY][OKAY] - - - ------------------------------------------------------------------------------------------------------------------------------------------------------- - --------------------------------------------------- - -op nameop nameop name op name ................................ ................ ................ installed installedinstalled.. installed .. compatible .. -..compatible -------------------------------------------------- - -compatible--------------------------------------------------compatible - - ----------------------------------------------------------------------------------------------------- - -cpu_adam ............... [YES]cpu_adam ......cpu_adam cpu_adam...............[OKAY] - ...............[YES]............... [YES]......[YES] ............ [OKAY][OKAY] - - [OKAY]fused_adam - ............. [NO] ....... [OKAY] -fused_adamfused_adam ..........................fused_lamb [NO][NO]fused_adam............. .......[NO] .................... [OKAY] .......[NO] -[OKAY] [OKAY] - -fused_lamb....... [OKAY]............. -fused_lamb [NO]............. fused_lamb ....... 
[NO] ............. [OKAY] .......[NO] -sparse_attn [OKAY]............ - .......[NO] [OKAY]....... - [OKAY] -sparse_attn transformer............ ............ sparse_attn [NO] [NO] ............ ....... ....... [NO][OKAY][OKAY]sparse_attn - - ....... transformerstochastic_transformer............ [OKAY]............ [NO] - .[NO].......transformer [NO] [OKAY]....... ............ -....... [NO][OKAY][OKAY] - - transformer....... stochastic_transformer ............ [OKAY] -[NO]. [NO] ....... .......stochastic_transformer[OKAY] -[OKAY] -. [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install pathtorch install path .............................. ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch versiontorch version ........................................ 1.8.11.8.1 - -torch cuda versiontorch cuda version .............................. 11.111.1 - -nvcc versionnvcc version .......................................... 11.211.2 - -deepspeed install pathdeepspeed install path ...................... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed infodeepspeed info ...................................... 0.4.2+bc17042, bc17042, big-science0.4.2+bc17042, bc17042, big-science - -deepspeed wheel compiled w.deepspeed wheel compiled w. ............ torch 1.8, cuda 11.1torch 1.8, cuda 11.1 - -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install pathtorch install path ............... ............... 
-DeepSpeed general environment info:
-torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']
-torch version .................... 1.8.1
-torch cuda version ............... 11.1
-nvcc version ..................... 11.2
-deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']
-deepspeed info ................... 0.4.2+bc17042, bc17042, big-science
-deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1
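-For reference, the op report and environment block above are what DeepSpeed's own reporting tool prints; they can be regenerated on demand (a sketch, assuming the tr1-13B conda env is activated):
-    # prints the extension op report plus "DeepSpeed general environment info:"
-    ds_report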
Op compatibility means that your system - meet the required dependencies to JIT install the op.-------------------------------------------------- - - -JIT compiled ops requires ninja--------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - - -JIT compiled ops requires ninja-------------------------------------------------- - -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninjaninjaninjaninja ...................................................... ..................[OKAY] -[OKAY][OKAY][OKAY] - --------------------------------------------------- - ------------------------------------------------------------------------------------------------------------------------------------------------------- - -op name -op name op name op name................ ................ ................ ................installed installedinstalled installed .. .... .. compatiblecompatible compatible - -compatible ----------------------------------------------------------------------------------------------------- - - --------------------------------------------------- --------------------------------------------------- -cpu_adam cpu_adam............... cpu_adam ............... [YES] ............... [YES] ...... [YES] ...... [OKAY] ...... -cpu_adam [OKAY] [OKAY] -............... - [YES]fused_adam ................... fused_adam fused_adam[OKAY][NO] - .......................... ....... [NO] [NO] [OKAY] ....... -....... [OKAY][OKAY] -fused_lamb - .............fused_lamb [NO]fused_lamb............. .......[NO] ............. [OKAY] ....... -[NO]fused_adam [OKAY].................... - [OKAY][NO] - ....... sparse_attn[OKAY] -............sparse_attn sparse_attn[NO]............ ...................[NO] [NO] fused_lamb[OKAY] ....... ....... - [OKAY][OKAY] - -.............transformer transformertransformer [NO] .................................... [NO][NO][NO] ............................ [OKAY][OKAY][OKAY][OKAY] - - - -stochastic_transformer stochastic_transformerstochastic_transformer. [NO].. .......[NO][NO] [OKAY].............. - [OKAY][OKAY] - -sparse_attn ............ [NO] ....... [OKAY] -DeepSpeed general environment info:DeepSpeed general environment info: - -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -torch install pathtorch install path .............................. ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... - 1.8.1 -torch version ....................torch cuda version 1.8.1............... - 11.1 -torch cuda versionnvcc version .................................... 11.111.2 - -nvcc versiondeepspeed install path ................................ 11.2 -['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']deepspeed install path - deepspeed info ................... 0.4.2+bc17042, bc17042, big-science........... 
-deepspeed wheel compiled w. ...... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']torch 1.8, cuda 11.1 - -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -DeepSpeed general environment info:torch version - .................... 1.8.1 -torch install pathtorch cuda version .............................. 11.1 -nvcc version ..................... 11.2 -['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']deepspeed install path - ........... torch version ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'].................... - deepspeed info1.8.1 -................... 0.4.2+bc17042, bc17042, big-sciencetorch cuda version - deepspeed wheel compiled w................ ......11.1 -torch 1.8, cuda 11.1nvcc version - ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -transformer_inference .. [NO] utils....... ..................[OKAY] -[YES] ...... [OKAY] -utils ..................quantizer [YES].............. ......[NO] [OKAY]....... - [OKAY] -quantizer-------------------------------------------------- -.............. [NO] ....... [OKAY] --------------------------------------------------- ------------------------------------------------------------------------------------------------------------------------------------------------------- - - -DeepSpeed C++/CUDA extension op report--------------------------------------------------DeepSpeed C++/CUDA extension op reportDeepSpeed C++/CUDA extension op report - - - ----------------------------------------------------------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- - --------------------------------------------------- ----------------------------------------------------------------------------------------------------- - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. 
-JIT compiled ops requires ninjaJIT compiled ops requires ninja - --------------------------------------------------- -JIT compiled ops requires ninja - -JIT compiled ops requires ninja - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_ioasync_io .............................. [NO][NO] .............. [NO][NO] - -transformer_inference transformer_inference.. ..[NO] [NO]....... .......[OKAY] -[OKAY] -utilsutils .................................... [YES][YES] ............ [OKAY][OKAY] - -quantizerquantizer ............................ [NO][NO] .............. [OKAY][OKAY] - ----------------------------------------------------------------------------------------------------- - - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_io ............... [NO] ....... async_io[NO] - ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -transformer_inference ..utils [NO].................. .......[YES] [OKAY]...... - [OKAY] -utilsquantizer ................................ [NO][YES] ............. [OKAY][OKAY] - --------------------------------------------------- -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_io ...............async_io [NO] ...................... [NO][NO] -....... [NO] -transformer_inferencetransformer_inference .... [NO][NO] .............. [OKAY][OKAY] - -utils utils.................. ..................[YES] [YES]...... ......[OKAY] -[OKAY] -quantizer quantizer.............. ..............[NO] [NO]....... .......[OKAY] -[OKAY] ----------------------------------------------------------------------------------------------------- - -ninjaninjaninjaninja .................. .................. .................. [OKAY].................. [OKAY] - [OKAY]-------------------------------------------------- -[OKAY] - - ---------------------------------------------------op name-------------------------------------------------- - -------------------------------------------------- -op name................ - op name ................op name installed ................ installed ................installed.... installed..compatiblecompatible - -compatible..-------------------------------------------------- --------------------------------------------------- --------------------------------------------------- - -compatible --------------------------------------------------- -cpu_adamcpu_adamcpu_adam cpu_adam.............................. ............... ...............[YES][YES] [YES]......[YES]...... ......[OKAY]......[OKAY] -[OKAY] -[OKAY] - -fused_adam fused_adamfused_adamfused_adam............. [NO].......................... ............. .......[NO] [NO] [OKAY][NO].............. -....... [OKAY] fused_lamb[OKAY] - -[OKAY] -.............fused_lamb [NO]fused_lamb fused_lamb ............. ....... .......................... [NO] [NO].......[OKAY][NO] ....... - [OKAY] ....... 
-[OKAY] -[OKAY] -sparse_attnsparse_attn sparse_attn ........................ sparse_attn[NO]............ [NO] ............[NO] .............. .......[NO][OKAY][OKAY] - - [OKAY]....... -transformer [OKAY]............transformer -[NO] transformer ....... transformer........................ [OKAY] -............ [NO] [NO] stochastic_transformer[NO] ....... ....... .[OKAY][OKAY]....... - - [NO][OKAY]stochastic_transformer -....... stochastic_transformer .[OKAY] -stochastic_transformer[NO]. .......[NO] . [OKAY] -.......[NO] [OKAY]....... - [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja-------------------------------------------------- - ---------------------------------------------------DeepSpeed C++/CUDA extension op report - --------------------------------------------------- -DeepSpeed C++/CUDA extension op reportNOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - ----------------------------------------------------------------------------------------------------- - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.JIT compiled ops requires ninja - --------------------------------------------------- -JIT compiled ops requires ninja -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -ninjaninjaninjaninja .................. .................................... ..................[OKAY][OKAY][OKAY] - -[OKAY] --------------------------------------------------- ----------------------------------------------------------------------------------------------------- - - -op name--------------------------------------------------op name -op name ................op name ................ ................installed................ installed..installed compatible -..installed.. -------------------------------------------------- compatible -.. -compatible --------------------------------------------------compatible - - ----------------------------------------------------------------------------------------------------- - -cpu_adam ............... [YES] cpu_adam...... ...............[OKAY] cpu_adam -[YES]cpu_adam .................................... [YES][YES][OKAY] fused_adam -...... ...... ............. [OKAY] [OKAY] -[NO] - ....... fused_adam[OKAY] -............. [NO] fused_lamb....... fused_adam[OKAY]fused_adam............. - ..........................[NO]fused_lamb [NO].......[NO]............. .......[NO][OKAY]....... - [OKAY][OKAY] -....... - [OKAY] -fused_lambfused_lamb .......................... 
sparse_attn[NO][NO] .......................... [NO][OKAY][OKAY] sparse_attn - -....... ............[OKAY] -[NO] ....... transformer[OKAY] -............ [NO]transformersparse_attn ...................sparse_attn ........................[OKAY] [NO] - [NO][NO]....... [OKAY]stochastic_transformer -....... ....... . stochastic_transformer[NO][OKAY] [OKAY] -....... -. transformer [OKAY] transformer[NO] -............ .......[NO]............ [OKAY]....... - [NO][OKAY] -....... [OKAY] -stochastic_transformer stochastic_transformer . .[NO] [NO]....... .......[OKAY] -[OKAY] -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** --------------------------------------------------- -DeepSpeed C++/CUDA extension op report-------------------------------------------------- - ----------------------------------------------------------------------------------------------------- -DeepSpeed C++/CUDA extension op report -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.--------------------------------------------------JIT compiled ops requires ninja - - ---------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - -JIT compiled ops requires ninja---------------------------------------------------------------------------------------------------- - - -JIT compiled ops requires ninjaDeepSpeed C++/CUDA extension op report - --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -ninjaninjaninjaninja .................. .................................... .................. 
[OKAY][OKAY] [OKAY] -[OKAY] - - ----------------------------------------------------------------------------------------------------- ----------------------------------------------------------------------------------------------------- - -op name -op name op nameop name................ ................................................installed installedinstalledinstalled .... compatible.. .. -compatible compatiblecompatible --------------------------------------------------- - ----------------------------------------------------------------------------------------------------- --------------------------------------------------- - - -cpu_adamcpu_adam cpu_adam...............cpu_adam ............... [YES]............... ............... [YES][YES]...... [YES] ......[OKAY]...... ...... -[OKAY] [OKAY] -[OKAY] - -fused_adam fused_adam............. fused_adam.............[NO] fused_adam ............. [NO]....... ............. .......[OKAY][NO] - [NO][OKAY]....... - fused_lamb[OKAY]....... - .............fused_lamb[OKAY] fused_lamb[NO] -............. ....................[NO] fused_lamb[NO].......[OKAY] ....... -.............[OKAY] -[OKAY][NO] - ....... [OKAY] -sparse_attn ............ [NO]sparse_attnsparse_attn ............................... [OKAY]sparse_attn[NO][NO] - .......................... transformer [NO] [OKAY] [OKAY] -................... - [OKAY][NO] - transformer.......transformertransformer [OKAY].................................... - [NO][NO] [NO] .............. stochastic_transformer....... [OKAY] [OKAY] -[OKAY]. - - [NO] stochastic_transformer....... stochastic_transformer stochastic_transformer [OKAY] . - ..[NO] [NO][NO]....... ..............[OKAY] -[OKAY][OKAY] - - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_io ............... async_io[NO] ...................... [NO][NO] -....... [NO] -transformer_inference ..transformer_inference [NO].. .......[NO] [OKAY]....... - [OKAY] -utils .................. utils[YES] ........................ [YES][OKAY] -...... [OKAY] -quantizer quantizer.............. ..............[NO] [NO]....... .......[OKAY] -[OKAY] --------------------------------------------------- --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install pathtorch install path .............................. ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch versiontorch version ........................................ 
1.8.11.8.1 - -torch cuda versiontorch cuda version .............................. 11.111.1 - -nvcc versionnvcc version .......................................... 11.211.2 - -deepspeed install pathdeepspeed install path ...................... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed info deepspeed info................... ...................0.4.2+bc17042, bc17042, big-science -0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. deepspeed wheel compiled w....... ......torch 1.8, cuda 11.1 -torch 1.8, cuda 11.1 -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install pathtorch install path .............................. ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch version ....................torch version 1.8.1.................... - 1.8.1 -torch cuda version ...............torch cuda version 11.1............... - 11.1nvcc version - nvcc version..................... .....................11.2 -11.2deepspeed install path - deepspeed install path........... ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed infodeepspeed info ...................................... 0.4.2+bc17042, bc17042, big-science0.4.2+bc17042, bc17042, big-science - -deepspeed wheel compiled w.deepspeed wheel compiled w. ............ torch 1.8, cuda 11.1torch 1.8, cuda 11.1 - -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install pathtorch install path .............................. ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch versiontorch version ........................................ 1.8.11.8.1 - -torch cuda versiontorch cuda version .............................. 11.111.1 - -nvcc versionnvcc version .......................................... 11.211.2 - -deepspeed install pathdeepspeed install path ...................... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed info deepspeed info................... ...................0.4.2+bc17042, bc17042, big-science -0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. deepspeed wheel compiled w....... ......torch 1.8, cuda 11.1 -torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... 
[OKAY] --------------------------------------------------- -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install pathtorch install path .............................. ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch versiontorch version ........................................ 1.8.11.8.1 - -torch cuda versiontorch cuda version .............................. 11.1 -nvcc version ..................... 11.111.2 - -nvcc versiondeepspeed install path ................................ 11.2 -['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']deepspeed install path - deepspeed info........... ...................['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -0.4.2+bc17042, bc17042, big-sciencedeepspeed info - deepspeed wheel compiled w.................... ......0.4.2+bc17042, bc17042, big-science -torch 1.8, cuda 11.1deepspeed wheel compiled w. - ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_io ............... [NO]async_io ....... ...............[NO] -[NO] ....... [NO] -transformer_inference .. [NO]transformer_inference ....... ..[OKAY] -[NO] ....... [OKAY] -utils .................. [YES] ...... utils[OKAY] -.................. [YES]quantizer .................... [OKAY][NO] - ....... [OKAY] -quantizer .............. --------------------------------------------------[NO] - ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_ioasync_io .............................. [NO][NO] .............. [NO][NO] - -transformer_inference ..transformer_inference [NO].. .......[NO] [OKAY]....... - [OKAY] -utils ..................utils [YES] ........................ [YES][OKAY] -...... [OKAY] -quantizer .............. [NO]quantizer ..................... [OKAY][NO] - ....... [OKAY]-------------------------------------------------- - --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... 
[OKAY] --------------------------------------------------- -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed general environment info: -DeepSpeed general environment info: -DeepSpeed general environment info:torch install path ............... - torch install path ............... torch install path ...............['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']torch version - ....................['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version1.8.1 -....................torch version torch cuda version 1.8.1 .................... -............... 1.8.1torch cuda version11.1 - -...............nvcc versiontorch cuda version 11.1 ............... -..................... nvcc version 11.1 11.2 -..................... -nvcc version deepspeed install path 11.2 ..................... -........... deepspeed install path 11.2 ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -...........deepspeed install path - deepspeed info........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] ................... - ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']deepspeed info0.4.2+bc17042, bc17042, big-science - -deepspeed info...................deepspeed wheel compiled w. .........................0.4.2+bc17042, bc17042, big-science -0.4.2+bc17042, bc17042, big-sciencetorch 1.8, cuda 11.1deepspeed wheel compiled w. - - deepspeed wheel compiled w....... 
......torch 1.8, cuda 11.1 -torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install pathtorch install path .............................. ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch versiontorch version ........................................ 1.8.11.8.1 - -torch cuda versiontorch cuda version .............................. 11.111.1 - -nvcc versionnvcc version .......................................... 11.211.2 - -deepspeed install pathdeepspeed install path ...................... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed infodeepspeed info ...................................... 0.4.2+bc17042, bc17042, big-science0.4.2+bc17042, bc17042, big-science - -deepspeed wheel compiled w.deepspeed wheel compiled w. ............ torch 1.8, cuda 11.1torch 1.8, cuda 11.1 - -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install path torch install path............... ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch versiontorch version ........................................ 1.8.1 -1.8.1 -torch cuda versiontorch cuda version .............................. 11.111.1 - -nvcc versionnvcc version .......................................... 11.211.2 - -deepspeed install pathdeepspeed install path ...................... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed infodeepspeed info ...................................... 0.4.2+bc17042, bc17042, big-science0.4.2+bc17042, bc17042, big-science - -deepspeed wheel compiled w.deepspeed wheel compiled w. ............ 
torch 1.8, cuda 11.1torch 1.8, cuda 11.1 - --------------------------------------------------- ---------------------------------------------------DeepSpeed C++/CUDA extension op report - --------------------------------------------------- -DeepSpeed C++/CUDA extension op report -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. ------------------------------------------------------------------------------------------------------------------------------------------------------- - - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.JIT compiled ops requires ninjaDeepSpeed C++/CUDA extension op report - --------------------------------------------------- --------------------------------------------------- - ---------------------------------------------------JIT compiled ops requires ninjaDeepSpeed C++/CUDA extension op report - - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.-------------------------------------------------- - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.-------------------------------------------------- - ---------------------------------------------------JIT compiled ops requires ninja - -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report ----------------------------------------------------------------------------------------------------- --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.DeepSpeed C++/CUDA extension op report - - -----------------------------------------------------------------------------------------------------DeepSpeed C++/CUDA extension op report - --------------------------------------------------- -JIT compiled ops requires ninjaNOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- - -DeepSpeed C++/CUDA extension op report - ---------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja - ---------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - -JIT compiled ops requires ninja-------------------------------------------------- - -JIT compiled ops requires ninja -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... 
['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -ninjaninjaninjaninja ...................................................... .................. [OKAY][OKAY] -[OKAY][OKAY] - --------------------------------------------------- ----------------------------------------------------------------------------------------------------- - --------------------------------------------------- -op nameop name - op nameop name ................................ ................................ installed installed installedinstalled .. .. ..compatible.. -ninjaninjaninjaninja ........................................................................ [OKAY][OKAY][OKAY][OKAY] - - - - --------------------------------------------------compatiblecompatiblecompatible - - - --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- - - - ----------------------------------------------------------------------------------------------------- --------------------------------------------------- - -op nameop name op nameop name ................................ ................installed................installed installed.. installed.... compatiblecompatible -..compatible --------------------------------------------------- - --------------------------------------------------- --------------------------------------------------compatible - - --------------------------------------------------- -cpu_adam ............... [YES]cpu_adam cpu_adam...... cpu_adam[OKAY] ............... -cpu_adam ............... [YES]cpu_adam cpu_adamcpu_adam...... ...............[OKAY]............... ............... -............... ............... [YES] [YES] [YES] ...... ............ [OKAY]fused_adam[OKAY][OKAY] - - - [YES][YES][YES] ............ ......[OKAY] -[OKAY][OKAY]fused_adam - -............. [NO] ....... [OKAY] - ............. [NO] ....... [OKAY]fused_adam -fused_adamfused_lamb fused_adam............. fused_adam ............. .............[NO].............[NO] .......[NO][NO]....... [OKAY] -.......[OKAY] -.......[OKAY]fused_lamb - [OKAY]............. - fused_adam.............fused_adam ..........................fused_lamb[NO] [NO].......[NO]............. .......[OKAY]....... -[NO] [OKAY][OKAY]fused_lamb....... - - fused_lamb[NO] ....................fused_lamb sparse_attn [OKAY][NO]............. - .............fused_lamb [OKAY]fused_lamb[NO] -............ .......[NO][NO] [OKAY].............. - .......................... ....... [NO] [NO] [OKAY] ....... - [OKAY][OKAY] - -sparse_attntransformer ........................ [NO][NO] .............. [OKAY]sparse_attn[OKAY] - - .......[OKAY] -[OKAY]sparse_attn - ............ [NO] ....... [OKAY] - transformersparse_attn............ stochastic_transformer............ ............ [NO].[NO] [NO] [NO] ............................ [OKAY][OKAY][OKAY][OKAY] - -sparse_attn transformersparse_attn ............ ............ sparse_attn............ [NO][NO] ............ [NO]....... ....... [NO][OKAY].......[OKAY] - - - -[OKAY]....... 
transformer -[OKAY]stochastic_transformer -transformerstochastic_transformer transformer............ . ............ [NO] [NO] [NO] .............. .......[OKAY][OKAY] - -[OKAY] - ............transformer [NO].transformer ............ [NO] ....... ............ ....... [NO] [OKAY] [OKAY][NO] -....... -stochastic_transformer .stochastic_transformer [NO] ........ [OKAY][NO] - ....... [OKAY] - stochastic_transformer.......[OKAY] -[OKAY]. - stochastic_transformer[NO] stochastic_transformer....... . [OKAY] [NO]. - .......[NO] [OKAY]....... - [OKAY] -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install pathtorch install path .............................. ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch versiontorch version ........................................ 1.8.11.8.1 - -torch cuda versiontorch cuda version .............................. 11.111.1 - -nvcc versionnvcc version .......................................... 11.211.2 - -deepspeed install pathdeepspeed install path ...................... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed infodeepspeed info ...................................... 0.4.2+bc17042, bc17042, big-science0.4.2+bc17042, bc17042, big-science - -deepspeed wheel compiled w.deepspeed wheel compiled w. ............ torch 1.8, cuda 11.1torch 1.8, cuda 11.1 - -DeepSpeed general environment info: -torch install pathDeepSpeed general environment info: ............... -torch install path['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -............... torch version .................... 1.8.1 -['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch cuda version ...............torch version 11.1.................... - 1.8.1nvcc version - ..................... torch cuda version11.2 -...............deepspeed install path 11.1........... - nvcc version ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']..................... - 11.2deepspeed info - deepspeed install path................... ...........0.4.2+bc17042, bc17042, big-science -['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']deepspeed wheel compiled w. - ......deepspeed info torch 1.8, cuda 11.1................... - 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 --------------------------------------------------- -DeepSpeed C++/CUDA extension op report ------------------------------------------------------------------------------------------------------------------------------------------------------- - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.DeepSpeed C++/CUDA extension op report - - -DeepSpeed C++/CUDA extension op report------------------------------------------------------------------------------------------------------------------------------------------------------ - - - ---------------------------------------------------DeepSpeed C++/CUDA extension op reportNOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. 
Op compatibility means that your system - meet the required dependencies to JIT install the op.JIT compiled ops requires ninja - - - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.---------------------------------------------------------------------------------------------------- - - ---------------------------------------------------JIT compiled ops requires ninjaNOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - - -JIT compiled ops requires ninja --------------------------------------------------- -JIT compiled ops requires ninja -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -ninjaninjaninja ninja...................................................... .................. [OKAY][OKAY][OKAY] -[OKAY] - - ----------------------------------------------------------------------------------------------------- ----------------------------------------------------------------------------------------------------- - - -op name op name................op nameop name ................................................installed installed..installedinstalled ..compatible.... - compatible--------------------------------------------------compatible - - -compatible---------------------------------------------------------------------------------------------------- - - --------------------------------------------------- -cpu_adam ............... [YES] cpu_adam...... cpu_adam...............[OKAY] cpu_adam -[YES]............... .....................[YES] [OKAY]......[YES] - [OKAY]fused_adam...... - ............. [OKAY][NO]fused_adam - .................... fused_adam [OKAY] [NO] -............. .......[NO] [OKAY]fused_lamb....... -fused_adam .............fused_lamb [OKAY] .......................... -[NO] [NO]fused_lamb.......[NO] ..............[OKAY]............. - [OKAY][NO] - [OKAY]....... -[OKAY] -fused_lamb .............sparse_attn [NO]............ sparse_attn[NO]....... ...................sparse_attn [OKAY][NO]............[OKAY] - -[NO]....... transformer.......[OKAY] -............[OKAY] -[NO]transformer transformer...................sparse_attn ............[OKAY][NO] -............[NO] ....... [NO].......stochastic_transformer[OKAY] -[OKAY]....... -. stochastic_transformer [OKAY]stochastic_transformer [NO] - ......... [NO][NO][OKAY] transformer....... -............ ....... [OKAY] [NO] -[OKAY] -....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -DeepSpeed general environment info: -torch install path ............... 
['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 --------------------------------------------------- ---------------------------------------------------DeepSpeed C++/CUDA extension op report---------------------------------------------------------------------------------------------------- - - - ---------------------------------------------------DeepSpeed C++/CUDA extension op reportDeepSpeed C++/CUDA extension op report - -DeepSpeed C++/CUDA extension op report-------------------------------------------------- - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- ---------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- - - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.-------------------------------------------------- -JIT compiled ops requires ninja - --------------------------------------------------- -JIT compiled ops requires ninja-------------------------------------------------- - - -JIT compiled ops requires ninjaJIT compiled ops requires ninja - -ninjaninjaninjaninja ...................................................... .................. [OKAY] [OKAY][OKAY] - -[OKAY] - ------------------------------------------------------------------------------------------------------------------------------------------------------- - --------------------------------------------------- -op name -op nameop name ................op name................................ installedinstalled................installed ......installed compatible compatiblecompatible.. - - - ------------------------------------------------------------------------------------------------------------------------------------------------------compatible - - - --------------------------------------------------- -cpu_adam cpu_adam...............cpu_adam cpu_adam[YES] ............... .....................[YES]............... [OKAY][YES]......[YES] - [OKAY]............ - [OKAY] -[OKAY] -fused_adam ............. [NO]fused_adam .................... [OKAY]fused_adamfused_adam[NO] - .................... .............fused_lamb[OKAY][NO] - [NO] ............. ....... fused_lamb....... [NO] [OKAY] .............[OKAY]....... - - [NO][OKAY] -.......fused_lamb fused_lamb [OKAY] ............. -............. [NO][NO] .............. [OKAY]sparse_attn[OKAY] - -............ [NO] .......sparse_attn [OKAY]............ - [NO] transformer....... ............[OKAY] sparse_attnsparse_attn - [NO] ............................... 
transformer [NO] [NO][OKAY] ............ -....... ....... [NO] [OKAY][OKAY]....... -stochastic_transformer - [OKAY]transformer -. transformer ............[NO] stochastic_transformer [NO]............ ....... . .......[NO] [OKAY] [NO] -[OKAY] ....... -....... [OKAY][OKAY]stochastic_transformer - - . [NO]stochastic_transformer ....... .[OKAY] - [NO] ....... [OKAY] -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. ----------------------------------------------------------------------------------------------------- -JIT compiled ops requires ninja-------------------------------------------------- - - --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -DeepSpeed C++/CUDA extension op report -DeepSpeed C++/CUDA extension op report ----------------------------------------------------------------------------------------------------- - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - ----------------------------------------------------------------------------------------------------- - -JIT compiled ops requires ninja -JIT compiled ops requires ninja -JIT compiled ops requires ninja ------------------------------------------------------------------------------------------------------------------------------------------------------- - -DeepSpeed C++/CUDA extension op report -DeepSpeed C++/CUDA extension op report - -DeepSpeed C++/CUDA extension op report---------------------------------------------------------------------------------------------------- - - ---------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. 
- [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`.
-async_io ............... [NO] ....... [NO]
-transformer_inference .. [NO] ....... [OKAY]
-utils .................. [YES] ...... [OKAY]
-quantizer .............. [NO] ....... [OKAY]
---------------------------------------------------
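async_io is the one op reported unusable on these nodes: the libaio development headers are absent, so it is neither installed nor JIT-compilable ([NO] in both columns), and the tool's own suggested fix is `apt install libaio-dev`. A hedged check under the same builder-layout assumption as above:

```python
# Sketch: check whether the async_io op could be built on this node.
# Expected to return False here, matching the [NO] ....... [NO] row,
# until the libaio-dev headers are installed.
from deepspeed.ops.op_builder import AsyncIOBuilder

print(AsyncIOBuilder().is_compatible())
```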
-DeepSpeed general environment info:
-torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']
-torch version .................... 1.8.1
-torch cuda version ............... 11.1
-nvcc version ..................... 11.2
-deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']
-deepspeed info ................... 0.4.2+bc17042, bc17042, big-science
-deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1
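The environment block records the toolchain this run was launched with: torch 1.8.1 built for CUDA 11.1, nvcc 11.2, and the big-science fork of DeepSpeed at 0.4.2+bc17042. A quick cross-check of the same fields from Python, using standard torch/deepspeed version attributes:

```python
# Sketch: cross-check the version fields of the environment block.
import torch
import deepspeed

print("torch version ......", torch.__version__)      # 1.8.1 in this run
print("torch cuda version .", torch.version.cuda)     # 11.1
print("deepspeed info .....", deepspeed.__version__)  # 0.4.2+bc17042
```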
-/bin/sh: line 0: type: git: not found
-**** Git info for Megatron: git_hash=unknown git_branch=unknown ****
11.2 -deepspeed install path DeepSpeed general environment info:........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed info ................... torch install path0.4.2+bc17042, bc17042, big-science -...............deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_io ............... async_io[NO] ...................... [NO][NO] -....... [NO] -transformer_inference .. [NO]transformer_inference ....... ..[OKAY] -[NO] ....... [OKAY] -utils .................. [YES] ......utils [OKAY].................. - [YES] quantizer...... ..............[OKAY] - [NO] .......quantizer [OKAY].............. - [NO] ....... [OKAY]-------------------------------------------------- - --------------------------------------------------- -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install pathtorch install path .............................. ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch versiontorch version ........................................ 1.8.11.8.1 - -torch cuda versiontorch cuda version .............................. 11.111.1 - -nvcc versionnvcc version .......................................... 11.211.2 - -deepspeed install pathdeepspeed install path ...................... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed infodeepspeed info ...................................... 0.4.2+bc17042, bc17042, big-science0.4.2+bc17042, bc17042, big-science - -deepspeed wheel compiled w.deepspeed wheel compiled w. ............ torch 1.8, cuda 11.1torch 1.8, cuda 11.1 - -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -transformer_inference ..utils [NO].................. .......[YES] [OKAY]...... - [OKAY] -quantizerutils ................................ [NO][YES] ............. 
[OKAY][OKAY] - --------------------------------------------------- -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`.utils - .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY]async_io - ............... --------------------------------------------------[NO] - ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed general environment info: -DeepSpeed general environment info:torch install path -............... torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version ....................['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -1.8.1 -torch version torch cuda version.................... ...............1.8.1 -11.1 -torch cuda versionnvcc version .................................... 11.111.2 - -nvcc versiondeepspeed install path ................................ 11.2 -['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']deepspeed install path - deepspeed info........... ................... 0.4.2+bc17042, bc17042, big-science['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed wheel compiled w.deepspeed info ......................... torch 1.8, cuda 11.10.4.2+bc17042, bc17042, big-science - -deepspeed wheel compiled w. ...... 
torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 ----------------------------------------------------------------------------------------------------- - -DeepSpeed C++/CUDA extension op reportDeepSpeed C++/CUDA extension op report - ----------------------------------------------------------------------------------------------------- - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - ----------------------------------------------------------------------------------------------------- - -JIT compiled ops requires ninjaJIT compiled ops requires ninja - --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- ---------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - ---------------------------------------------------DeepSpeed C++/CUDA extension op report - -JIT compiled ops requires ninja-------------------------------------------------- - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -ninjaninjaninjaninja ...................................................... .................. [OKAY] [OKAY][OKAY] - -[OKAY] --------------------------------------------------- - ------------------------------------------------------------------------------------------------------------------------------------------------------- - - -op nameop nameop name op name ................ 
................................installed ................ installed .. installedinstalled .. compatible .... - compatible --------------------------------------------------compatible -compatible - - ------------------------------------------------------------------------------------------------------------------------------------------------------- - - -cpu_adam ............... cpu_adamcpu_adamcpu_adam[YES] .............................. ...... [YES][YES]...............[OKAY] -............[YES] [OKAY][OKAY]...... - - [OKAY] -fused_adam ............. fused_adamfused_adam[NO] fused_adam ................................. .............[NO][NO][OKAY] -.......[NO]....... [OKAY]fused_lamb -.......[OKAY] -.............[OKAY]fused_lamb -fused_lamb[NO] .................................fused_lamb [NO][OKAY][NO] -............. ..............[NO] [OKAY][OKAY]....... - - [OKAY] -sparse_attn ............ [NO] sparse_attn....... sparse_attnsparse_attn [OKAY] ........................ -............ [NO][NO][NO]transformer ................................. [OKAY][NO][OKAY][OKAY] - -....... -transformer transformer[OKAY]transformer - .................................... [NO]stochastic_transformer[NO] [NO] ..................... [OKAY].[OKAY][OKAY] - - -[NO] stochastic_transformer....... stochastic_transformerstochastic_transformer[OKAY] . - [NO]. ........ [NO] [NO] [OKAY] ....... -....... [OKAY][OKAY] - -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... 
torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -/bin/sh: line 0: type: git: not found -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install pathtorch install path .............................. ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch versiontorch version ........................................ 1.8.11.8.1 - -torch cuda versiontorch cuda version .............................. 11.111.1 - -nvcc versionnvcc version .......................................... 11.211.2 - -deepspeed install pathdeepspeed install path ...................... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed infodeepspeed info ...................................... 0.4.2+bc17042, bc17042, big-science0.4.2+bc17042, bc17042, big-science - -deepspeed wheel compiled w.deepspeed wheel compiled w. ............ 
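The "DeepSpeed general environment info" block above is emitted once per launcher process, which is why it repeats (and interleaves) for every rank in the raw log. A minimal Python sketch that reproduces the same fields, assuming only that torch and deepspeed are importable; DeepSpeed's bundled ds_report utility prints the authoritative version of this report:

    # Reprint the fields of "DeepSpeed general environment info".
    # Illustrative reimplementation only; `ds_report` (installed
    # alongside DeepSpeed) is the canonical tool for this output.
    import torch
    import deepspeed

    print("torch install path ...............", list(torch.__path__))
    print("torch version ....................", torch.__version__)
    print("torch cuda version ...............", torch.version.cuda)
    print("deepspeed install path ...........", list(deepspeed.__path__))
    print("deepspeed info ...................", deepspeed.__version__)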
-using world size: 256, data-parallel-size: 8, tensor-model-parallel size: 4, pipeline-model-parallel size: 8
-using torch.float16 for parameters ...
------------------------- arguments ------------------------
- accumulate_allreduce_grads_in_fp32 .............. False
- adam_beta1 ...................................... 0.9
- adam_beta2 ...................................... 0.999
- adam_eps ........................................ 1e-08
- adlr_autoresume ................................. False
- adlr_autoresume_interval ........................ 1000
- apply_query_key_layer_scaling ................... True
- apply_residual_connection_post_layernorm ........ False
- attention_dropout ............................... 0.1
- attention_softmax_in_fp32 ....................... False
- bert_binary_head ................................ True
- bert_load ....................................... None
- bf16 ............................................ False
- bias_dropout_fusion ............................. True
- bias_gelu_fusion ................................ True
- biencoder_projection_dim ........................ 0
- biencoder_shared_query_context_model ............ False
- block_data_path ................................. None
- checkpoint_activations .......................... True
- checkpoint_in_cpu ............................... False
- checkpoint_num_layers ........................... 1
- clip_grad ....................................... 1.0
- codecarbon_dir .................................. /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/tr8-104B-data/codecarbon
- consumed_train_samples .......................... 0
- consumed_valid_samples .......................... 0
- contigious_checkpointing ........................ False
- cpu_optimizer ................................... False
- cpu_torch_adam .................................. False
- data_impl ....................................... mmap
- data_parallel_size .............................. 8
- data_path ....................................... ['/gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document']
- dataloader_type ................................. single
- DDP_impl ........................................ local
- decoder_seq_length .............................. None
- deepscale ....................................... False
- deepscale_config ................................ None
- deepspeed ....................................... True
- deepspeed_activation_checkpointing .............. True
- deepspeed_config ................................ ./ds_config.1161730.json
- deepspeed_mpi ................................... False
- distribute_checkpointed_activations ............. False
- distributed_backend ............................. nccl
- embedding_path .................................. None
- encoder_seq_length .............................. 2048
- eod_mask_loss ................................... False
- eval_interval ................................... 1000
- eval_iters ...................................... 5
- evidence_data_path .............................. None
- exit_duration_in_mins ........................... 110
- exit_interval ................................... None
- ffn_hidden_size ................................. 20480
- finetune ........................................ False
- fp16 ............................................ True
- fp16_lm_cross_entropy ........................... False
- fp32_residual_connection ........................ False
- global_batch_size ............................... 2048
- hidden_dropout .................................. 0.1
- hidden_size ..................................... 16384
- hysteresis ...................................... 2
- ict_head_size ................................... None
- ict_load ........................................ None
- img_dim ......................................... 224
- indexer_batch_size .............................. 128
- indexer_log_interval ............................ 1000
- init_method_std ................................. 0.02
- init_method_xavier_uniform ...................... False
- initial_loss_scale .............................. 4294967296
- kv_channels ..................................... 512
- layernorm_epsilon ............................... 1e-05
- lazy_mpu_init ................................... None
- load ............................................ /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints
- local_rank ...................................... 0
- log_batch_size_to_tensorboard ................... True
- log_interval .................................... 1
- log_learning_rate_to_tensorboard ................ True
- log_loss_scale_to_tensorboard ................... True
- log_num_zeros_in_grad ........................... False
- log_params_norm ................................. False
- log_timers_to_tensorboard ....................... True
- log_validation_ppl_to_tensorboard ............... True
- loss_scale ...................................... 12.0
- loss_scale_window ............................... 1000
- lr .............................................. 6e-05
- lr_decay_iters .................................. None
- lr_decay_samples ................................ 126953125
- lr_decay_style .................................. cosine
- lr_warmup_fraction .............................. None
- lr_warmup_iters ................................. 0
- lr_warmup_samples ............................... 216320
- make_vocab_size_divisible_by .................... 128
- mask_prob ....................................... 0.15
- masked_softmax_fusion ........................... True
- max_position_embeddings ......................... 2048
- memory_centric_tiled_linear ..................... False
- merge_file ...................................... /gpfswork/rech/six/commun/code/tr8-104B/Megatron-DeepSpeed-tr8-104B/data/gpt2-merges.txt
- micro_batch_size ................................ 1
- min_loss_scale .................................. 1.0
- min_lr .......................................... 6e-06
- mmap_warmup ..................................... False
- no_load_optim ................................... None
- no_load_rng ..................................... None
- no_save_optim ................................... None
- no_save_rng ..................................... None
- num_attention_heads ............................. 32
- num_channels .................................... 3
- num_classes ..................................... 1000
- num_layers ...................................... 32
- num_layers_per_virtual_pipeline_stage ........... None
- num_workers ..................................... 2
- onnx_safe ....................................... None
- openai_gelu ..................................... False
- optimizer ....................................... adam
- override_lr_scheduler ........................... False
- params_dtype .................................... torch.float16
- partition_activations ........................... False
- patch_dim ....................................... 16
- pipeline_model_parallel_size .................... 8
- position_embedding_type ......................... PositionEmbeddingType.absolute
- profile_backward ................................ False
- query_in_block_prob ............................. 0.1
- rampup_batch_size ............................... ['16', '16', '6_000_000']
- rank ............................................ 0
- remote_device ................................... none
- reset_attention_mask ............................ False
- reset_position_ids .............................. False
- retriever_report_topk_accuracies ................ []
- retriever_score_scaling ......................... False
- retriever_seq_length ............................ 256
- sample_rate ..................................... 1.0
- save ............................................ /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints
- save_interval ................................... 1500
- scatter_gather_tensors_in_pipeline .............. True
- scattered_embeddings ............................ False
- seed ............................................ 42
- seq_length ...................................... 2048
- sgd_momentum .................................... 0.9
- short_seq_prob .................................. 0.1
- split ........................................... 949,50,1
- split_transformers .............................. False
- synchronize_each_layer .......................... False
- tensor_model_parallel_size ...................... 4
- tensorboard_dir ................................. /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/tr8-104B-data/tensorboard
- tensorboard_log_interval ........................ 1
- tensorboard_queue_size .......................... 5
- tile_factor ..................................... 1
- titles_data_path ................................ None
- tokenizer_name_or_path .......................... None
- tokenizer_type .................................. GPT2BPETokenizer
- train_iters ..................................... None
- train_samples ................................... 300000000
- use_checkpoint_lr_scheduler ..................... False
- use_contiguous_buffers_in_ddp ................... False
- use_cpu_initialization .......................... None
- use_one_sent_docs ............................... False
- use_pin_memory .................................. False
- virtual_pipeline_model_parallel_size ............ None
- vocab_extra_ids ................................. 0
- vocab_file ...................................... /gpfswork/rech/six/commun/code/tr8-104B/Megatron-DeepSpeed-tr8-104B/data/gpt2-vocab.json
- weight_decay .................................... 0.1
- world_size ...................................... 256
- zero_allgather_bucket_size ...................... 0.0
- zero_contigious_gradients ....................... False
- zero_reduce_bucket_size ......................... 0.0
- zero_reduce_scatter ............................. False
- zero_stage ...................................... 1
--------------------- end of arguments ---------------------
-will use batch size rampup starting from global batch size 16 to global batch size 2048 with batch size increments 16 over 6000000 samples.
-> building GPT2BPETokenizer tokenizer ...
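Several values in the argument dump above are derived from one another, and the rampup announcement can be checked by hand. A small sketch of the arithmetic (variable names are mine, not Megatron-LM's; the even spacing of batch-size increments is an assumption consistent with the announcement above):

    # Sanity-check derived quantities from the argument dump.
    world_size = 256
    dp, tp, pp = 8, 4, 8                  # data / tensor / pipeline parallel sizes
    assert world_size == dp * tp * pp     # 8 * 4 * 8 = 256 ranks

    hidden_size, num_heads = 16384, 32
    assert hidden_size // num_heads == 512    # matches kv_channels above
    assert 4294967296 == 2 ** 32              # initial_loss_scale above

    # rampup_batch_size = ['16', '16', '6_000_000']: start at 16 and grow
    # by 16 until global_batch_size = 2048, over 6,000,000 samples.
    start, incr, ramp_samples, target = 16, 16, 6_000_000, 2048
    num_increments = (target - start) // incr              # 127 increments
    print(num_increments, ramp_samples / num_increments)   # 127, ~47244 samples each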
-/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install pathtorch install path .............................. ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch versiontorch version ........................................ 1.8.11.8.1 - -torch cuda versiontorch cuda version .............................. 11.111.1 - -nvcc versionnvcc version .......................................... 11.211.2 - -deepspeed install pathdeepspeed install path ...................... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed infodeepspeed info ...................................... 0.4.2+bc17042, bc17042, big-science0.4.2+bc17042, bc17042, big-science - -deepspeed wheel compiled w.deepspeed wheel compiled w. ............ torch 1.8, cuda 11.1torch 1.8, cuda 11.1 - - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... 
['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... DeepSpeed general environment info:['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info - ................... 0.4.2+bc17042, bc17042, big-science -torch install pathdeepspeed wheel compiled w. ..................... torch 1.8, cuda 11.1 -['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 ----------------------------------------------------------------------------------------------------- - -DeepSpeed C++/CUDA extension op reportDeepSpeed C++/CUDA extension op report - ------------------------------------------------------------------------------------------------------------------------------------------------------- - - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.DeepSpeed C++/CUDA extension op report - - ------------------------------------------------------------------------------------------------------------------------------------------------------- --------------------------------------------------- -JIT compiled ops requires ninja -JIT compiled ops requires ninja -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. -DeepSpeed C++/CUDA extension op report - - ----------------------------------------------------------------------------------------------------- - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.JIT compiled ops requires ninja - --------------------------------------------------- -JIT compiled ops requires ninja -ninjaninjaninjaninja ........................................................................ [OKAY][OKAY][OKAY][OKAY] - - - --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- - - - -op nameop nameop name op name................................ ................ installed................ installed installed.. installed .. .. compatible.. compatible - compatible--------------------------------------------------compatible - - - ----------------------------------------------------------------------------------------------------- --------------------------------------------------- - -cpu_adam ...............cpu_adam cpu_adamcpu_adam[YES] ................................................... [YES][YES] [OKAY] [YES]...... -...... 
......[OKAY][OKAY] -[OKAY] - -fused_adam ............. [NO] fused_adamfused_adam.......fused_adam ............. ............. .............[OKAY] [NO] [NO][NO] - .......fused_lamb.............. [OKAY].............[OKAY] -[OKAY] - -[NO]fused_lamb .......fused_lamb............. fused_lamb [OKAY] ............. -[NO]............. [NO].......[NO] ..............[OKAY] -[OKAY][OKAY] - -sparse_attn ............ [NO] ....... [OKAY] -sparse_attntransformersparse_attnsparse_attn ........................ [NO] ............................... [NO][OKAY][NO][NO] - ....... ....... transformer.......[OKAY] -[OKAY]............[OKAY] - - stochastic_transformer[NO] transformer.......transformer. ............ ............ [NO][OKAY] -[NO][NO]....... .......stochastic_transformer[OKAY]....... - [OKAY]. -[OKAY] -[NO] .......stochastic_transformer [OKAY]stochastic_transformer - . [NO]. .......[NO] [OKAY]....... - [OKAY] -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. ----------------------------------------------------------------------------------------------------- -JIT compiled ops requires ninja - -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - -ninja ninja.................. ..................[OKAY] -[OKAY] --------------------------------------------------- --------------------------------------------------- -op name op name................ ................ installedinstalled .... 
compatiblecompatible - ----------------------------------------------------------------------------------------------------- - -cpu_adamcpu_adam .............................. [YES][YES] ............ [OKAY] -[OKAY] -fused_adamfused_adam .......................... [NO][NO] .............. [OKAY][OKAY] - -fused_lambfused_lamb .......................... [NO][NO] .............. [OKAY][OKAY] - -sparse_attnsparse_attn ............ ............[NO] [NO]....... .......[OKAY] - [OKAY] -transformertransformer ............ ............[NO] [NO]....... .......[OKAY] - [OKAY] -stochastic_transformer stochastic_transformer . [NO]. ....... [NO][OKAY] - ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] ----------------------------------------------------------------------------------------------------- - -DeepSpeed C++/CUDA extension op reportDeepSpeed C++/CUDA extension op report - ------------------------------------------------------------------------------------------------------------------------------------------------------- - - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.DeepSpeed C++/CUDA extension op report-------------------------------------------------- - - - ------------------------------------------------------------------------------------------------------------------------------------------------------- - -DeepSpeed C++/CUDA extension op report -JIT compiled ops requires ninjaJIT compiled ops requires ninja -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - --------------------------------------------------- - ---------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - -JIT compiled ops requires ninja-------------------------------------------------- - -JIT compiled ops requires ninja -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... 
['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -ninjaninjaninjaninja .................. ....................................[OKAY].................. - [OKAY][OKAY][OKAY] - --------------------------------------------------- ----------------------------------------------------------------------------------------------------- - --------------------------------------------------- -op name -op name op nameop name................................ ................ ................installed installedinstalledinstalled.. ..compatible.... - --------------------------------------------------compatiblecompatiblecompatible - - - ----------------------------------------------------------------------------------------------------- - --------------------------------------------------- -cpu_adam ............... [YES] cpu_adam......cpu_adam ...............cpu_adam[OKAY]............... - [YES][YES] ............... ............ [OKAY] [OKAY] -[YES] -fused_adam ................... [OKAY][NO] - ....... [OKAY]fused_adam -fused_adam ..........................fused_lamb .............[NO][NO] [NO]....... ....... ....... fused_adam[OKAY] [OKAY] -[OKAY] - - .............fused_lamb fused_lamb [NO] ............. ............. .......[NO] sparse_attn [NO][OKAY] ....... ............ -[OKAY]....... -[NO]fused_lamb[OKAY] -....... .............[OKAY] -[NO] .......transformer sparse_attn............[OKAY] sparse_attn............[NO] - ...................[NO] [NO] [OKAY] ....... -....... [OKAY][OKAY] -stochastic_transformer - transformertransformer . sparse_attn ............ [NO] ........................[NO] [NO].............. .......[OKAY][NO][OKAY] - -[OKAY] -stochastic_transformer.......stochastic_transformer [OKAY] . - .[NO] [NO]transformer....... .......[OKAY] - [OKAY]............ -[NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... 
[OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** ------------------------------------------------------------------------------------------------------------------------------------------------------- - - -DeepSpeed C++/CUDA extension op reportDeepSpeed C++/CUDA extension op reportDeepSpeed C++/CUDA extension op report - --------------------------------------------------- ----------------------------------------------------------------------------------------------------- - - ---------------------------------------------------DeepSpeed C++/CUDA extension op report -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. ----------------------------------------------------------------------------------------------------- - - ---------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.--------------------------------------------------JIT compiled ops requires ninja - - -JIT compiled ops requires ninja -JIT compiled ops requires ninja-------------------------------------------------- - - -JIT compiled ops requires ninja - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -ninjaninjaninjaninja ........................................................................ [OKAY] [OKAY][OKAY][OKAY] - - - ------------------------------------------------------------------------------------------------------------------------------------------------------- --------------------------------------------------- - -op name -op name op name ................op name ................ installed ................installed .................. .. installedcompatiblecompatibleinstalled - - ..----------------------------------------------------------------------------------------------------.. 
- - compatiblecompatible - ----------------------------------------------------------------------------------------------------- - -cpu_adamcpu_adam .............................. cpu_adam [YES]cpu_adam [YES] ...... .............................. ......[OKAY] -[YES][YES][OKAY] -............ [OKAY][OKAY] - -fused_adam ............. [NO] fused_adam....... fused_adamfused_adam .............[OKAY]............. .............[NO] - [NO]....... [NO] .......fused_lamb [OKAY] ....... -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -[OKAY]............. -[OKAY]fused_lamb[NO] - fused_lamb.................... .............fused_lamb[NO][OKAY] [NO]....... -............. [OKAY].......[NO] - [OKAY]....... - [OKAY] -sparse_attn ............ sparse_attn[NO] ................... sparse_attn[OKAY]sparse_attn[NO] - ........................transformer....... [NO][OKAY]............[NO] - [NO].............. transformer....... [OKAY] ............[OKAY] -[OKAY][NO] - - transformer.......transformer stochastic_transformer[OKAY]........................ - [NO][NO] . stochastic_transformer ....... [NO]....... [OKAY]. .......[OKAY] -[NO][OKAY] - -....... [OKAY] -stochastic_transformerstochastic_transformer .. [NO][NO] .............. [OKAY][OKAY] - ------------------------------------------------------------------------------------------------------------------------------------------------------- - - -DeepSpeed C++/CUDA extension op reportDeepSpeed C++/CUDA extension op reportDeepSpeed C++/CUDA extension op report - - ----------------------------------------------------------------------------------------------------- --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. ----------------------------------------------------------------------------------------------------- --------------------------------------------------- - -JIT compiled ops requires ninja-------------------------------------------------- - -DeepSpeed C++/CUDA extension op report -JIT compiled ops requires ninja - -JIT compiled ops requires ninja-------------------------------------------------- - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -ninjaninja ninja.................. ninja.................. 
....................................[OKAY] [OKAY] -[OKAY] -[OKAY]-------------------------------------------------- --------------------------------------------------- - - -----------------------------------------------------------------------------------------------------op nameop name - - op name................................ op nameinstalled ................ installed..installed ....................compatible -compatiblecompatible -installed --------------------------------------------------- ---------------------------------------------------------------------------------------------------- -.. - compatible - --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -cpu_adam ............... cpu_adam[YES] .....................cpu_adam [YES][OKAY]............... -...... cpu_adam[YES][OKAY] -async_io ............... [NO] ....... [NO] -fused_adam...... ............................[OKAY] -[NO] .......[YES] [OKAY]fused_adam -...... .............[OKAY] fused_lamb -transformer_inference .. [NO] ....... [OKAY] -[NO] fused_adam .......................... .......[NO] [NO] [OKAY] .............. -fused_lambfused_adam [OKAY] -[OKAY]............. -utils .................. [YES] ...... [OKAY] -............. [NO] fused_lamb.......[NO] ....................[OKAY] - sparse_attn[NO][OKAY] -quantizer .............. [NO] ....... [OKAY] - ............ .......fused_lamb[NO] [OKAY].................... --------------------------------------------------- - [OKAY][NO] -sparse_attn ............ [NO]transformer .......................... [NO][OKAY] [OKAY] -....... -sparse_attn [OKAY] transformer - ........................ stochastic_transformer [NO][NO]. .............. [NO] [OKAY]sparse_attn -[OKAY] - transformer.......stochastic_transformer............ [OKAY]............[NO] - . [NO] ....... [OKAY].......[NO] [OKAY]....... - - [OKAY]transformer stochastic_transformer............ -. [NO][NO] ....... .......[OKAY] -[OKAY] -stochastic_transformer . [NO] ....... [OKAY] - > padded vocab (size: 50257) with 431 dummy tokens (new size: 50688) - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO]async_io - ............... [NO] ....... [NO] -transformer_inference .. [NO] transformer_inference....... ..[OKAY] -[NO] ....... [OKAY]utils - .................. [YES] ...... [OKAY]utils - .................. [YES] quantizer...... ..............[OKAY] -[NO] ....... [OKAY] -quantizer .............. [NO]-------------------------------------------------- -....... [OKAY] --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. 
-DeepSpeed general environment info:
-torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']
-torch version .................... 1.8.1
-torch cuda version ............... 11.1
-nvcc version ..................... 11.2
-deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']
-deepspeed info ................... 0.4.2+bc17042, bc17042, big-science
-deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1
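These fields can be reproduced from Python in the same environment; a minimal sketch using only stable torch/deepspeed attributes (the nvcc line comes from the CUDA toolkit on PATH and is not queried here):

    import os
    import torch
    import deepspeed

    # Mirrors the "DeepSpeed general environment info" block above.
    print("torch install path ...", os.path.dirname(torch.__file__))
    print("torch version ........", torch.__version__)      # 1.8.1 in this run
    print("torch cuda version ...", torch.version.cuda)     # 11.1 in this run
    print("deepspeed info .......", deepspeed.__version__)  # 0.4.2+bc17042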
-/bin/sh: line 0: type: git: not found
-**** Git info for Megatron: git_hash=unknown git_branch=unknown ****
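These two lines appear because the compute nodes have no `git` binary on PATH: Megatron shells out to record the code revision for logging and falls back to `unknown`. An illustrative sketch of that pattern (not Megatron's exact code):

    import subprocess

    def _git(cmd: str) -> str:
        # With git missing, the shell prints "type: git: not found" /
        # produces no output, and we fall back to "unknown", as seen here.
        result = subprocess.run(cmd, shell=True, capture_output=True, text=True)
        return result.stdout.strip() or "unknown"

    git_hash = _git("git rev-parse --short HEAD")
    git_branch = _git("git rev-parse --abbrev-ref HEAD")
    print(f"**** Git info for Megatron: git_hash={git_hash} git_branch={git_branch} ****")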
-> setting codecarbon ...
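`codecarbon` is the energy/CO2 tracker enabled for this run. A minimal sketch of the library's standard API (how the training script actually wires it up is not shown in this log):

    from codecarbon import EmissionsTracker

    # Standard codecarbon usage: measure emissions around a workload.
    tracker = EmissionsTracker(output_dir="emissions")  # writes emissions.csv
    tracker.start()
    # ... training steps would run here ...
    tracker.stop()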
[NO][OKAY][NO] - ..............sparse_attn [OKAY][OKAY]............ - - [NO] ....... sparse_attn[OKAY] -............ [NO] transformer....... sparse_attn............[OKAY]sparse_attn -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - -............ [NO] ............ transformer[NO] .......[NO]....... ............[OKAY]....... [OKAY][OKAY] - - -[NO] .......transformertransformer stochastic_transformer [OKAY]............ -............ . [NO] [NO][NO].......stochastic_transformer ....... ....... .[OKAY] -[OKAY] -[OKAY][NO]stochastic_transformer - ....... [OKAY]. -stochastic_transformer [NO] ........ [OKAY][NO] - ....... [OKAY] ----------------------------------------------------------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report -DeepSpeed C++/CUDA extension op report --------------------------------------------------- - ---------------------------------------------------DeepSpeed C++/CUDA extension op report--------------------------------------------------DeepSpeed C++/CUDA extension op report - - - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.--------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- --------------------------------------------------- - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. ---------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. -JIT compiled ops requires ninja - --------------------------------------------------- -JIT compiled ops requires ninja-------------------------------------------------- - - -JIT compiled ops requires ninjaJIT compiled ops requires ninja - -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -ninjaninjaninjaninja .................................... .................. ..................[OKAY][OKAY] - [OKAY] -[OKAY]-------------------------------------------------- - --------------------------------------------------- - -----------------------------------------------------------------------------------------------------op name -op name - op name................................op name installed................installed................ .. 
-> initializing torch distributed ...
-> setting tensorboard ...
-> initializing tensor model parallel with size 4
-> initializing pipeline model parallel with size 8
-> setting random seeds to 42 ...
-[2021-09-24 02:11:21,021] [INFO] [checkpointing.py:226:model_parallel_cuda_manual_seed] > initializing model parallel cuda seeds on global rank 0, model parallel rank 0, and data parallel rank 0 with model parallel seed: 2760 and data parallel seed: 42
-> compiling dataset index builder ...
-make: Entering directory '/gpfsssd/worksf/projects/rech/six/commun/code/tr8-104B/Megatron-DeepSpeed-tr8-104B/megatron/data'
-make: Nothing to be done for 'default'.
-make: Leaving directory '/gpfsssd/worksf/projects/rech/six/commun/code/tr8-104B/Megatron-DeepSpeed-tr8-104B/megatron/data'
->>> done with dataset index builder. Compilation time: 0.307 seconds
-> compiling and loading fused kernels ...
-/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch/utils/cpp_extension.py:283: UserWarning:
-
-                               !! WARNING !!
-
-!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
-Your compiler (c++) is not compatible with the compiler Pytorch was
-built with for this platform, which is g++ on linux. Please
-use g++ to to compile your extension. Alternatively, you may
-compile PyTorch from source using c++, and then you can also use
-c++ to compile your extension.
-
-See https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md for help
-with compiling PyTorch from source.
-!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
-
-                              !! WARNING !!
-
-  warnings.warn(WRONG_COMPILER_WARNING.format(
-Detected CUDA files, patching ldflags
-Emitting ninja build file /gpfsssd/worksf/projects/rech/six/commun/code/tr8-104B/Megatron-DeepSpeed-tr8-104B/megatron/fused_kernels/build/build.ninja...
-Building extension module scaled_upper_triang_masked_softmax_cuda...
-Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
-ninja: no work to do.
-Loading extension module scaled_upper_triang_masked_softmax_cuda...
-Detected CUDA files, patching ldflags
-Emitting ninja build file /gpfsssd/worksf/projects/rech/six/commun/code/tr8-104B/Megatron-DeepSpeed-tr8-104B/megatron/fused_kernels/build/build.ninja...
-Building extension module scaled_masked_softmax_cuda...
-Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
-ninja: no work to do.
-Loading extension module scaled_masked_softmax_cuda...
-Detected CUDA files, patching ldflags
-Emitting ninja build file /gpfsssd/worksf/projects/rech/six/commun/code/tr8-104B/Megatron-DeepSpeed-tr8-104B/megatron/fused_kernels/build/build.ninja...
-Building extension module fused_mix_prec_layer_norm_cuda...
-Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
-ninja: no work to do.
-Loading extension module fused_mix_prec_layer_norm_cuda...
-/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch/utils/cpp_extension.py:283: UserWarning:
-
-                               !! WARNING !!
-
-!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
-Your compiler (c++) is not compatible with the compiler Pytorch was
-built with for this platform, which is g++ on linux. Please
-use g++ to to compile your extension.
Alternatively, you may -compile PyTorch from source using c++, and then you can also use -c++ to compile your extension. - -See https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md for help -with compiling PyTorch from source. -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! - - !! WARNING !! - - warnings.warn(WRONG_COMPILER_WARNING.format( -/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch/utils/cpp_extension.py:283: UserWarning: - - !! WARNING !! - -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! -Your compiler (c++) is not compatible with the compiler Pytorch was -built with for this platform, which is g++ on linux. Please -use g++ to to compile your extension. Alternatively, you may -compile PyTorch from source using c++, and then you can also use -c++ to compile your extension. - -See https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md for help -with compiling PyTorch from source. -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! - - !! WARNING !! - - warnings.warn(WRONG_COMPILER_WARNING.format( -/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch/utils/cpp_extension.py:283: UserWarning: - - !! WARNING !! - -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! -Your compiler (c++) is not compatible with the compiler Pytorch was -built with for this platform, which is g++ on linux. Please -use g++ to to compile your extension. Alternatively, you may -compile PyTorch from source using c++, and then you can also use -c++ to compile your extension. - -See https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md for help -with compiling PyTorch from source. -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! - - !! WARNING !! - - warnings.warn(WRONG_COMPILER_WARNING.format( -/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch/utils/cpp_extension.py:283: UserWarning: - - !! WARNING !! - -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! -Your compiler (c++) is not compatible with the compiler Pytorch was -built with for this platform, which is g++ on linux. Please -use g++ to to compile your extension. Alternatively, you may -compile PyTorch from source using c++, and then you can also use -c++ to compile your extension. - -See https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md for help -with compiling PyTorch from source. -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! - - !! WARNING !! - - warnings.warn(WRONG_COMPILER_WARNING.format( -/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch/utils/cpp_extension.py:283: UserWarning: - - !! WARNING !! - -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! -Your compiler (c++) is not compatible with the compiler Pytorch was -built with for this platform, which is g++ on linux. Please -use g++ to to compile your extension. Alternatively, you may -compile PyTorch from source using c++, and then you can also use -c++ to compile your extension. - -See https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md for help -with compiling PyTorch from source. -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! - - !! WARNING !! 
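The warning recommends building extensions with g++ rather than the generic c++ driver. A minimal sketch of applying that suggestion before JIT-compiling an extension, assuming torch.utils.cpp_extension honors the CXX environment variable; the extension name and source files are hypothetical placeholders, not this run's fused kernels:

import os

# Assumption: torch.utils.cpp_extension picks its host compiler from CXX,
# so it must be set before the build is triggered.
os.environ["CXX"] = "g++"

from torch.utils.cpp_extension import load

# Hypothetical extension name and sources, for illustration only.
fused = load(
    name="fused_demo",
    sources=["fused_demo.cpp", "fused_demo_kernel.cu"],
    verbose=True,
)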
->>> done with compiling and loading fused kernels. Compilation time: 27.029 seconds
-time to initialize megatron (seconds): 68.551
-[after megatron is initialized] datetime: 2021-09-24 02:11:48
-building GPT model ...
-[2021-09-24 02:11:48,760] [INFO] [utils.py:680:see_memory_usage] Before Building Model
-/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch/cuda/memory.py:373: FutureWarning: torch.cuda.memory_cached has been renamed to torch.cuda.memory_reserved
- warnings.warn(
-/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch/cuda/memory.py:381: FutureWarning: torch.cuda.max_memory_cached has been renamed to torch.cuda.max_memory_reserved
- warnings.warn(
-[2021-09-24 02:11:48,762] [INFO] [utils.py:681:see_memory_usage] MA 0.0 GB Max_MA 0.0 GB CA 0.0 GB Max_CA 0 GB
-[2021-09-24 02:11:48,763] [INFO] [utils.py:689:see_memory_usage] CPU Virtual Memory: used = 37.77 GB, percent = 20.2%
-SEED_LAYERS=False BASE_SEED=1234 SEED_FN=None
-Using topology: {ProcessCoord(pipe=0, data=0, model=0): 0, ProcessCoord(pipe=0, data=0, model=1): 1, ProcessCoord(pipe=0, data=0, model=2): 2, ProcessCoord(pipe=0, data=0, model=3): 3, ProcessCoord(pipe=0, data=1, model=0): 4, ProcessCoord(pipe=0, data=1, model=1): 5, ProcessCoord(pipe=0, data=1, model=2): 6, ProcessCoord(pipe=0, data=1, model=3): 7, ProcessCoord(pipe=0, data=2, model=0): 8, ProcessCoord(pipe=0, data=2, model=1): 9, ProcessCoord(pipe=0, data=2, model=2): 10, ProcessCoord(pipe=0, data=2, model=3): 11, ProcessCoord(pipe=0, data=3, model=0): 12, ProcessCoord(pipe=0, data=3, model=1): 13, ProcessCoord(pipe=0, data=3, model=2): 14, ProcessCoord(pipe=0, data=3, model=3): 15, ProcessCoord(pipe=0, data=4, model=0): 16, ProcessCoord(pipe=0, data=4, model=1): 17, ProcessCoord(pipe=0, data=4, model=2): 18, ProcessCoord(pipe=0, data=4, model=3): 19, ProcessCoord(pipe=0, data=5, model=0): 20, ProcessCoord(pipe=0, data=5, model=1): 21, ProcessCoord(pipe=0, data=5, model=2): 22, ProcessCoord(pipe=0, data=5, model=3): 23, ProcessCoord(pipe=0, data=6, model=0): 24, ProcessCoord(pipe=0, data=6, model=1): 25, ProcessCoord(pipe=0, data=6, model=2): 26, ProcessCoord(pipe=0, data=6, model=3): 27, ProcessCoord(pipe=0, data=7, model=0): 28, ProcessCoord(pipe=0, data=7, model=1): 29, ProcessCoord(pipe=0, data=7, model=2): 30, ProcessCoord(pipe=0, data=7, model=3): 31, ProcessCoord(pipe=1, data=0, model=0): 32, ProcessCoord(pipe=1, data=0, model=1): 33, ProcessCoord(pipe=1, data=0, model=2): 34, ProcessCoord(pipe=1, data=0, model=3): 35, ProcessCoord(pipe=1, data=1, model=0): 36, ProcessCoord(pipe=1, data=1, model=1): 37, ProcessCoord(pipe=1, data=1, model=2): 38, ProcessCoord(pipe=1, data=1, model=3): 39,
ProcessCoord(pipe=1, data=2, model=0): 40, ProcessCoord(pipe=1, data=2, model=1): 41, ProcessCoord(pipe=1, data=2, model=2): 42, ProcessCoord(pipe=1, data=2, model=3): 43, ProcessCoord(pipe=1, data=3, model=0): 44, ProcessCoord(pipe=1, data=3, model=1): 45, ProcessCoord(pipe=1, data=3, model=2): 46, ProcessCoord(pipe=1, data=3, model=3): 47, ProcessCoord(pipe=1, data=4, model=0): 48, ProcessCoord(pipe=1, data=4, model=1): 49, ProcessCoord(pipe=1, data=4, model=2): 50, ProcessCoord(pipe=1, data=4, model=3): 51, ProcessCoord(pipe=1, data=5, model=0): 52, ProcessCoord(pipe=1, data=5, model=1): 53, ProcessCoord(pipe=1, data=5, model=2): 54, ProcessCoord(pipe=1, data=5, model=3): 55, ProcessCoord(pipe=1, data=6, model=0): 56, ProcessCoord(pipe=1, data=6, model=1): 57, ProcessCoord(pipe=1, data=6, model=2): 58, ProcessCoord(pipe=1, data=6, model=3): 59, ProcessCoord(pipe=1, data=7, model=0): 60, ProcessCoord(pipe=1, data=7, model=1): 61, ProcessCoord(pipe=1, data=7, model=2): 62, ProcessCoord(pipe=1, data=7, model=3): 63, ProcessCoord(pipe=2, data=0, model=0): 64, ProcessCoord(pipe=2, data=0, model=1): 65, ProcessCoord(pipe=2, data=0, model=2): 66, ProcessCoord(pipe=2, data=0, model=3): 67, ProcessCoord(pipe=2, data=1, model=0): 68, ProcessCoord(pipe=2, data=1, model=1): 69, ProcessCoord(pipe=2, data=1, model=2): 70, ProcessCoord(pipe=2, data=1, model=3): 71, ProcessCoord(pipe=2, data=2, model=0): 72, ProcessCoord(pipe=2, data=2, model=1): 73, ProcessCoord(pipe=2, data=2, model=2): 74, ProcessCoord(pipe=2, data=2, model=3): 75, ProcessCoord(pipe=2, data=3, model=0): 76, ProcessCoord(pipe=2, data=3, model=1): 77, ProcessCoord(pipe=2, data=3, model=2): 78, ProcessCoord(pipe=2, data=3, model=3): 79, ProcessCoord(pipe=2, data=4, model=0): 80, ProcessCoord(pipe=2, data=4, model=1): 81, ProcessCoord(pipe=2, data=4, model=2): 82, ProcessCoord(pipe=2, data=4, model=3): 83, ProcessCoord(pipe=2, data=5, model=0): 84, ProcessCoord(pipe=2, data=5, model=1): 85, ProcessCoord(pipe=2, data=5, model=2): 86, ProcessCoord(pipe=2, data=5, model=3): 87, ProcessCoord(pipe=2, data=6, model=0): 88, ProcessCoord(pipe=2, data=6, model=1): 89, ProcessCoord(pipe=2, data=6, model=2): 90, ProcessCoord(pipe=2, data=6, model=3): 91, ProcessCoord(pipe=2, data=7, model=0): 92, ProcessCoord(pipe=2, data=7, model=1): 93, ProcessCoord(pipe=2, data=7, model=2): 94, ProcessCoord(pipe=2, data=7, model=3): 95, ProcessCoord(pipe=3, data=0, model=0): 96, ProcessCoord(pipe=3, data=0, model=1): 97, ProcessCoord(pipe=3, data=0, model=2): 98, ProcessCoord(pipe=3, data=0, model=3): 99, ProcessCoord(pipe=3, data=1, model=0): 100, ProcessCoord(pipe=3, data=1, model=1): 101, ProcessCoord(pipe=3, data=1, model=2): 102, ProcessCoord(pipe=3, data=1, model=3): 103, ProcessCoord(pipe=3, data=2, model=0): 104, ProcessCoord(pipe=3, data=2, model=1): 105, ProcessCoord(pipe=3, data=2, model=2): 106, ProcessCoord(pipe=3, data=2, model=3): 107, ProcessCoord(pipe=3, data=3, model=0): 108, ProcessCoord(pipe=3, data=3, model=1): 109, ProcessCoord(pipe=3, data=3, model=2): 110, ProcessCoord(pipe=3, data=3, model=3): 111, ProcessCoord(pipe=3, data=4, model=0): 112, ProcessCoord(pipe=3, data=4, model=1): 113, ProcessCoord(pipe=3, data=4, model=2): 114, ProcessCoord(pipe=3, data=4, model=3): 115, ProcessCoord(pipe=3, data=5, model=0): 116, ProcessCoord(pipe=3, data=5, model=1): 117, ProcessCoord(pipe=3, data=5, model=2): 118, ProcessCoord(pipe=3, data=5, model=3): 119, ProcessCoord(pipe=3, data=6, model=0): 120, ProcessCoord(pipe=3, data=6, model=1): 121, 
ProcessCoord(pipe=3, data=6, model=2): 122, ProcessCoord(pipe=3, data=6, model=3): 123, ProcessCoord(pipe=3, data=7, model=0): 124, ProcessCoord(pipe=3, data=7, model=1): 125, ProcessCoord(pipe=3, data=7, model=2): 126, ProcessCoord(pipe=3, data=7, model=3): 127, ProcessCoord(pipe=4, data=0, model=0): 128, ProcessCoord(pipe=4, data=0, model=1): 129, ProcessCoord(pipe=4, data=0, model=2): 130, ProcessCoord(pipe=4, data=0, model=3): 131, ProcessCoord(pipe=4, data=1, model=0): 132, ProcessCoord(pipe=4, data=1, model=1): 133, ProcessCoord(pipe=4, data=1, model=2): 134, ProcessCoord(pipe=4, data=1, model=3): 135, ProcessCoord(pipe=4, data=2, model=0): 136, ProcessCoord(pipe=4, data=2, model=1): 137, ProcessCoord(pipe=4, data=2, model=2): 138, ProcessCoord(pipe=4, data=2, model=3): 139, ProcessCoord(pipe=4, data=3, model=0): 140, ProcessCoord(pipe=4, data=3, model=1): 141, ProcessCoord(pipe=4, data=3, model=2): 142, ProcessCoord(pipe=4, data=3, model=3): 143, ProcessCoord(pipe=4, data=4, model=0): 144, ProcessCoord(pipe=4, data=4, model=1): 145, ProcessCoord(pipe=4, data=4, model=2): 146, ProcessCoord(pipe=4, data=4, model=3): 147, ProcessCoord(pipe=4, data=5, model=0): 148, ProcessCoord(pipe=4, data=5, model=1): 149, ProcessCoord(pipe=4, data=5, model=2): 150, ProcessCoord(pipe=4, data=5, model=3): 151, ProcessCoord(pipe=4, data=6, model=0): 152, ProcessCoord(pipe=4, data=6, model=1): 153, ProcessCoord(pipe=4, data=6, model=2): 154, ProcessCoord(pipe=4, data=6, model=3): 155, ProcessCoord(pipe=4, data=7, model=0): 156, ProcessCoord(pipe=4, data=7, model=1): 157, ProcessCoord(pipe=4, data=7, model=2): 158, ProcessCoord(pipe=4, data=7, model=3): 159, ProcessCoord(pipe=5, data=0, model=0): 160, ProcessCoord(pipe=5, data=0, model=1): 161, ProcessCoord(pipe=5, data=0, model=2): 162, ProcessCoord(pipe=5, data=0, model=3): 163, ProcessCoord(pipe=5, data=1, model=0): 164, ProcessCoord(pipe=5, data=1, model=1): 165, ProcessCoord(pipe=5, data=1, model=2): 166, ProcessCoord(pipe=5, data=1, model=3): 167, ProcessCoord(pipe=5, data=2, model=0): 168, ProcessCoord(pipe=5, data=2, model=1): 169, ProcessCoord(pipe=5, data=2, model=2): 170, ProcessCoord(pipe=5, data=2, model=3): 171, ProcessCoord(pipe=5, data=3, model=0): 172, ProcessCoord(pipe=5, data=3, model=1): 173, ProcessCoord(pipe=5, data=3, model=2): 174, ProcessCoord(pipe=5, data=3, model=3): 175, ProcessCoord(pipe=5, data=4, model=0): 176, ProcessCoord(pipe=5, data=4, model=1): 177, ProcessCoord(pipe=5, data=4, model=2): 178, ProcessCoord(pipe=5, data=4, model=3): 179, ProcessCoord(pipe=5, data=5, model=0): 180, ProcessCoord(pipe=5, data=5, model=1): 181, ProcessCoord(pipe=5, data=5, model=2): 182, ProcessCoord(pipe=5, data=5, model=3): 183, ProcessCoord(pipe=5, data=6, model=0): 184, ProcessCoord(pipe=5, data=6, model=1): 185, ProcessCoord(pipe=5, data=6, model=2): 186, ProcessCoord(pipe=5, data=6, model=3): 187, ProcessCoord(pipe=5, data=7, model=0): 188, ProcessCoord(pipe=5, data=7, model=1): 189, ProcessCoord(pipe=5, data=7, model=2): 190, ProcessCoord(pipe=5, data=7, model=3): 191, ProcessCoord(pipe=6, data=0, model=0): 192, ProcessCoord(pipe=6, data=0, model=1): 193, ProcessCoord(pipe=6, data=0, model=2): 194, ProcessCoord(pipe=6, data=0, model=3): 195, ProcessCoord(pipe=6, data=1, model=0): 196, ProcessCoord(pipe=6, data=1, model=1): 197, ProcessCoord(pipe=6, data=1, model=2): 198, ProcessCoord(pipe=6, data=1, model=3): 199, ProcessCoord(pipe=6, data=2, model=0): 200, ProcessCoord(pipe=6, data=2, model=1): 201, ProcessCoord(pipe=6, data=2, 
model=2): 202, ProcessCoord(pipe=6, data=2, model=3): 203, ProcessCoord(pipe=6, data=3, model=0): 204, ProcessCoord(pipe=6, data=3, model=1): 205, ProcessCoord(pipe=6, data=3, model=2): 206, ProcessCoord(pipe=6, data=3, model=3): 207, ProcessCoord(pipe=6, data=4, model=0): 208, ProcessCoord(pipe=6, data=4, model=1): 209, ProcessCoord(pipe=6, data=4, model=2): 210, ProcessCoord(pipe=6, data=4, model=3): 211, ProcessCoord(pipe=6, data=5, model=0): 212, ProcessCoord(pipe=6, data=5, model=1): 213, ProcessCoord(pipe=6, data=5, model=2): 214, ProcessCoord(pipe=6, data=5, model=3): 215, ProcessCoord(pipe=6, data=6, model=0): 216, ProcessCoord(pipe=6, data=6, model=1): 217, ProcessCoord(pipe=6, data=6, model=2): 218, ProcessCoord(pipe=6, data=6, model=3): 219, ProcessCoord(pipe=6, data=7, model=0): 220, ProcessCoord(pipe=6, data=7, model=1): 221, ProcessCoord(pipe=6, data=7, model=2): 222, ProcessCoord(pipe=6, data=7, model=3): 223, ProcessCoord(pipe=7, data=0, model=0): 224, ProcessCoord(pipe=7, data=0, model=1): 225, ProcessCoord(pipe=7, data=0, model=2): 226, ProcessCoord(pipe=7, data=0, model=3): 227, ProcessCoord(pipe=7, data=1, model=0): 228, ProcessCoord(pipe=7, data=1, model=1): 229, ProcessCoord(pipe=7, data=1, model=2): 230, ProcessCoord(pipe=7, data=1, model=3): 231, ProcessCoord(pipe=7, data=2, model=0): 232, ProcessCoord(pipe=7, data=2, model=1): 233, ProcessCoord(pipe=7, data=2, model=2): 234, ProcessCoord(pipe=7, data=2, model=3): 235, ProcessCoord(pipe=7, data=3, model=0): 236, ProcessCoord(pipe=7, data=3, model=1): 237, ProcessCoord(pipe=7, data=3, model=2): 238, ProcessCoord(pipe=7, data=3, model=3): 239, ProcessCoord(pipe=7, data=4, model=0): 240, ProcessCoord(pipe=7, data=4, model=1): 241, ProcessCoord(pipe=7, data=4, model=2): 242, ProcessCoord(pipe=7, data=4, model=3): 243, ProcessCoord(pipe=7, data=5, model=0): 244, ProcessCoord(pipe=7, data=5, model=1): 245, ProcessCoord(pipe=7, data=5, model=2): 246, ProcessCoord(pipe=7, data=5, model=3): 247, ProcessCoord(pipe=7, data=6, model=0): 248, ProcessCoord(pipe=7, data=6, model=1): 249, ProcessCoord(pipe=7, data=6, model=2): 250, ProcessCoord(pipe=7, data=6, model=3): 251, ProcessCoord(pipe=7, data=7, model=0): 252, ProcessCoord(pipe=7, data=7, model=1): 253, ProcessCoord(pipe=7, data=7, model=2): 254, ProcessCoord(pipe=7, data=7, model=3): 255} -[2021-09-24 02:11:50,155] [INFO] [module.py:360:_partition_layers] Partitioning pipeline stages with method type:transformer -stage=0 layers=7 - 0: _to_float16 - 1: EmbeddingPipe - 2: - 3: ParallelTransformerLayerPipe - 4: ParallelTransformerLayerPipe - 5: ParallelTransformerLayerPipe - 6: ParallelTransformerLayerPipe -stage=1 layers=4 - 7: ParallelTransformerLayerPipe - 8: ParallelTransformerLayerPipe - 9: ParallelTransformerLayerPipe - 10: ParallelTransformerLayerPipe -stage=2 layers=4 - 11: ParallelTransformerLayerPipe - 12: ParallelTransformerLayerPipe - 13: ParallelTransformerLayerPipe - 14: ParallelTransformerLayerPipe -stage=3 layers=4 - 15: ParallelTransformerLayerPipe - 16: ParallelTransformerLayerPipe - 17: ParallelTransformerLayerPipe - 18: ParallelTransformerLayerPipe -stage=4 layers=4 - 19: ParallelTransformerLayerPipe - 20: ParallelTransformerLayerPipe - 21: ParallelTransformerLayerPipe - 22: ParallelTransformerLayerPipe -stage=5 layers=4 - 23: ParallelTransformerLayerPipe - 24: ParallelTransformerLayerPipe - 25: ParallelTransformerLayerPipe - 26: ParallelTransformerLayerPipe -stage=6 layers=4 - 27: ParallelTransformerLayerPipe - 28: ParallelTransformerLayerPipe - 29: 
ParallelTransformerLayerPipe - 30: ParallelTransformerLayerPipe -stage=7 layers=8 - 31: ParallelTransformerLayerPipe - 32: ParallelTransformerLayerPipe - 33: ParallelTransformerLayerPipe - 34: ParallelTransformerLayerPipe - 35: - 36: MixedFusedLayerNorm - 37: EmbeddingPipe - 38: float16_to_fp32 - loss: CrossEntropy - > number of parameters on (tensor, pipeline) model parallel rank (0, 3): 1745293312 - > number of parameters on (tensor, pipeline) model parallel rank (1, 1): 1745293312 - > number of parameters on (tensor, pipeline) model parallel rank (3, 1): 1745293312 - > number of parameters on (tensor, pipeline) model parallel rank (0, 1): 1745293312 - > number of parameters on (tensor, pipeline) model parallel rank (2, 1): 1745293312 - > number of parameters on (tensor, pipeline) model parallel rank (3, 3): 1745293312 - > number of parameters on (tensor, pipeline) model parallel rank (1, 3): 1745293312 - > number of parameters on (tensor, pipeline) model parallel rank (2, 3): 1745293312 - > number of parameters on (tensor, pipeline) model parallel rank (3, 6): 1745293312 - > number of parameters on (tensor, pipeline) model parallel rank (0, 6): 1745293312 - > number of parameters on (tensor, pipeline) model parallel rank (1, 6): 1745293312 - > number of parameters on (tensor, pipeline) model parallel rank (2, 6): 1745293312 - > number of parameters on (tensor, pipeline) model parallel rank (2, 2): 1745293312 - > number of parameters on (tensor, pipeline) model parallel rank (0, 2): 1745293312 - > number of parameters on (tensor, pipeline) model parallel rank (1, 2): 1745293312 - > number of parameters on (tensor, pipeline) model parallel rank (3, 2): 1745293312 - > number of parameters on (tensor, pipeline) model parallel rank (0, 5): 1745293312 - > number of parameters on (tensor, pipeline) model parallel rank (1, 5): 1745293312 - > number of parameters on (tensor, pipeline) model parallel rank (3, 5): 1745293312 - > number of parameters on (tensor, pipeline) model parallel rank (2, 5): 1745293312 - > number of parameters on (tensor, pipeline) model parallel rank (0, 4): 1745293312 - > number of parameters on (tensor, pipeline) model parallel rank (2, 4): 1745293312 - > number of parameters on (tensor, pipeline) model parallel rank (1, 4): 1745293312 - > number of parameters on (tensor, pipeline) model parallel rank (3, 4): 1745293312 - > number of parameters on (tensor, pipeline) model parallel rank (2, 7): 1986498560 - > number of parameters on (tensor, pipeline) model parallel rank (1, 7): 1986498560 - > number of parameters on (tensor, pipeline) model parallel rank (0, 7): 1986498560 - > number of parameters on (tensor, pipeline) model parallel rank (3, 7): 1986498560 - > number of parameters on (tensor, pipeline) model parallel rank (2, 0): 1986465792 - > number of parameters on (tensor, pipeline) model parallel rank (3, 0): 1986465792 - > number of parameters on (tensor, pipeline) model parallel rank (1, 0): 1986465792 -[2021-09-24 02:11:51,439] [INFO] [utils.py:680:see_memory_usage] After Building Model -[2021-09-24 02:11:51,440] [INFO] [utils.py:681:see_memory_usage] MA 3.77 GB Max_MA 3.79 GB CA 3.79 GB Max_CA 4 GB -[2021-09-24 02:11:51,441] [INFO] [utils.py:689:see_memory_usage] CPU Virtual Memory: used = 37.96 GB, percent = 20.3% - > number of parameters on (tensor, pipeline) model parallel rank (0, 0): 1986465792 -setting training iterations to 159576 -> learning rate decay style: cosine -DeepSpeed is enabled. 
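In the topology dump above, the rank index varies fastest along the model (tensor-parallel) axis, then data, then pipe: TP=4, DP=8, PP=8, for 4 x 8 x 8 = 256 ranks in total. A minimal sketch of that mapping, inferred from the printed coordinates rather than taken from DeepSpeed's own code:

# Rank layout implied by the topology dump (inferred, not DeepSpeed source).
TP, DP, PP = 4, 8, 8  # tensor-, data-, pipeline-parallel degrees

def coord_to_rank(pipe: int, data: int, model: int) -> int:
    # model varies fastest, then data, then pipe
    return model + TP * (data + DP * pipe)

# Spot-checks against entries printed above.
assert coord_to_rank(0, 0, 1) == 1
assert coord_to_rank(0, 1, 0) == 4
assert coord_to_rank(1, 0, 0) == 32
assert coord_to_rank(7, 7, 3) == 255
assert TP * DP * PP == 256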
-[2021-09-24 02:11:51,495] [INFO] [logging.py:68:log_dist] [Rank 0] DeepSpeed info: version=0.4.2+bc17042, git-hash=bc17042, git-branch=big-science -[2021-09-24 02:11:51,606] [INFO] [engine.py:179:__init__] DeepSpeed Flops Profiler Enabled: False -[2021-09-24 02:11:51,606] [INFO] [engine.py:736:_configure_optimizer] Removing param_group that has no 'params' in the client Optimizer -[2021-09-24 02:11:51,606] [INFO] [engine.py:741:_configure_optimizer] Using client Optimizer as basic optimizer -[2021-09-24 02:11:51,606] [INFO] [engine.py:750:_configure_optimizer] DeepSpeed Basic Optimizer = FusedAdam -[2021-09-24 02:11:51,607] [INFO] [utils.py:43:is_zero_supported_optimizer] Checking ZeRO support for optimizer=FusedAdam type= -[2021-09-24 02:11:51,607] [INFO] [logging.py:68:log_dist] [Rank 0] Creating fp16 ZeRO stage 1 optimizer -[2021-09-24 02:11:51,607] [INFO] [stage2.py:106:__init__] Reduce bucket size 500000000 -[2021-09-24 02:11:51,607] [INFO] [stage2.py:107:__init__] Allgather bucket size 500000000 -[2021-09-24 02:11:51,607] [INFO] [stage2.py:108:__init__] CPU Offload: False -[2021-09-24 02:11:51,607] [INFO] [stage2.py:109:__init__] Round robin gradient partitioning: False -[2021-09-24 02:11:56,299] [INFO] [stage2.py:419:__init__] optimizer state initialized -[2021-09-24 02:11:56,299] [INFO] [logging.py:68:log_dist] [Rank 0] DeepSpeed Final Optimizer = FusedAdam -[2021-09-24 02:11:56,299] [INFO] [engine.py:553:_configure_lr_scheduler] DeepSpeed using client LR scheduler -[2021-09-24 02:11:56,299] [INFO] [logging.py:68:log_dist] [Rank 0] DeepSpeed LR Scheduler = -[2021-09-24 02:11:56,300] [INFO] [logging.py:68:log_dist] [Rank 0] step=0, skipped=0, lr=[0.0, 0.0], mom=[(0.9, 0.999), (0.9, 0.999)] -[2021-09-24 02:11:56,300] [INFO] [config.py:900:print] DeepSpeedEngine configuration: -[2021-09-24 02:11:56,300] [INFO] [config.py:904:print] activation_checkpointing_config { - "partition_activations": false, - "contiguous_memory_optimization": false, - "cpu_checkpointing": false, - "number_checkpoints": null, - "synchronize_checkpoint_boundary": false, - "profile": false -} -[2021-09-24 02:11:56,300] [INFO] [config.py:904:print] aio_config ................... {'block_size': 1048576, 'queue_depth': 8, 'thread_count': 1, 'single_submit': False, 'overlap_events': True} -[2021-09-24 02:11:56,300] [INFO] [config.py:904:print] allreduce_always_fp32 ........ False -[2021-09-24 02:11:56,300] [INFO] [config.py:904:print] amp_enabled .................. False -[2021-09-24 02:11:56,300] [INFO] [config.py:904:print] amp_params ................... False -[2021-09-24 02:11:56,300] [INFO] [config.py:904:print] checkpoint_tag_validation_enabled True -[2021-09-24 02:11:56,300] [INFO] [config.py:904:print] checkpoint_tag_validation_fail False -[2021-09-24 02:11:56,300] [INFO] [config.py:904:print] disable_allgather ............ False -[2021-09-24 02:11:56,300] [INFO] [config.py:904:print] dump_state ................... False -[2021-09-24 02:11:56,300] [INFO] [config.py:904:print] dynamic_loss_scale_args ...... {'init_scale': 4096, 'scale_window': 500, 'delayed_shift': 2, 'min_scale': 1} -[2021-09-24 02:11:56,300] [INFO] [config.py:904:print] eigenvalue_enabled ........... False -[2021-09-24 02:11:56,300] [INFO] [config.py:904:print] eigenvalue_gas_boundary_resolution 1 -[2021-09-24 02:11:56,300] [INFO] [config.py:904:print] eigenvalue_layer_name ........ bert.encoder.layer -[2021-09-24 02:11:56,300] [INFO] [config.py:904:print] eigenvalue_layer_num ......... 
0 -[2021-09-24 02:11:56,300] [INFO] [config.py:904:print] eigenvalue_max_iter .......... 100 -[2021-09-24 02:11:56,300] [INFO] [config.py:904:print] eigenvalue_stability ......... 1e-06 -[2021-09-24 02:11:56,300] [INFO] [config.py:904:print] eigenvalue_tol ............... 0.01 -[2021-09-24 02:11:56,300] [INFO] [config.py:904:print] eigenvalue_verbose ........... False -[2021-09-24 02:11:56,300] [INFO] [config.py:904:print] elasticity_enabled ........... False -[2021-09-24 02:11:56,301] [INFO] [config.py:904:print] flops_profiler_config ........ { - "enabled": false, - "profile_step": 1, - "module_depth": -1, - "top_modules": 1, - "detailed": true, - "output_file": null -} -[2021-09-24 02:11:56,301] [INFO] [config.py:904:print] fp16_enabled ................. True -[2021-09-24 02:11:56,301] [INFO] [config.py:904:print] fp16_mixed_quantize .......... False -[2021-09-24 02:11:56,301] [INFO] [config.py:904:print] global_rank .................. 0 -[2021-09-24 02:11:56,301] [INFO] [config.py:904:print] gradient_accumulation_steps .. 256 -[2021-09-24 02:11:56,301] [INFO] [config.py:904:print] gradient_clipping ............ 1.0 -[2021-09-24 02:11:56,301] [INFO] [config.py:904:print] gradient_predivide_factor .... 1.0 -[2021-09-24 02:11:56,301] [INFO] [config.py:904:print] initial_dynamic_scale ........ 4096 -[2021-09-24 02:11:56,301] [INFO] [config.py:904:print] loss_scale ................... 0 -[2021-09-24 02:11:56,301] [INFO] [config.py:904:print] memory_breakdown ............. False -[2021-09-24 02:11:56,301] [INFO] [config.py:904:print] optimizer_legacy_fusion ...... False -[2021-09-24 02:11:56,301] [INFO] [config.py:904:print] optimizer_name ............... None -[2021-09-24 02:11:56,301] [INFO] [config.py:904:print] optimizer_params ............. None -[2021-09-24 02:11:56,301] [INFO] [config.py:904:print] pipeline ..................... {'stages': 'auto', 'partition': 'best', 'seed_layers': False, 'activation_checkpoint_interval': 0} -[2021-09-24 02:11:56,301] [INFO] [config.py:904:print] pld_enabled .................. False -[2021-09-24 02:11:56,301] [INFO] [config.py:904:print] pld_params ................... False -[2021-09-24 02:11:56,301] [INFO] [config.py:904:print] prescale_gradients ........... False -[2021-09-24 02:11:56,301] [INFO] [config.py:904:print] quantize_change_rate ......... 0.001 -[2021-09-24 02:11:56,301] [INFO] [config.py:904:print] quantize_groups .............. 1 -[2021-09-24 02:11:56,301] [INFO] [config.py:904:print] quantize_offset .............. 1000 -[2021-09-24 02:11:56,301] [INFO] [config.py:904:print] quantize_period .............. 1000 -[2021-09-24 02:11:56,301] [INFO] [config.py:904:print] quantize_rounding ............ 0 -[2021-09-24 02:11:56,301] [INFO] [config.py:904:print] quantize_start_bits .......... 16 -[2021-09-24 02:11:56,301] [INFO] [config.py:904:print] quantize_target_bits ......... 8 -[2021-09-24 02:11:56,301] [INFO] [config.py:904:print] quantize_training_enabled .... False -[2021-09-24 02:11:56,301] [INFO] [config.py:904:print] quantize_type ................ 0 -[2021-09-24 02:11:56,301] [INFO] [config.py:904:print] quantize_verbose ............. False -[2021-09-24 02:11:56,301] [INFO] [config.py:904:print] scheduler_name ............... None -[2021-09-24 02:11:56,301] [INFO] [config.py:904:print] scheduler_params ............. None -[2021-09-24 02:11:56,301] [INFO] [config.py:904:print] sparse_attention ............. None -[2021-09-24 02:11:56,301] [INFO] [config.py:904:print] sparse_gradients_enabled ..... 
False -[2021-09-24 02:11:56,301] [INFO] [config.py:904:print] steps_per_print .............. 2000 -[2021-09-24 02:11:56,301] [INFO] [config.py:904:print] tensorboard_enabled .......... False -[2021-09-24 02:11:56,301] [INFO] [config.py:904:print] tensorboard_job_name ......... DeepSpeedJobName -[2021-09-24 02:11:56,301] [INFO] [config.py:904:print] tensorboard_output_path ...... -[2021-09-24 02:11:56,301] [INFO] [config.py:904:print] train_batch_size ............. 2048 -[2021-09-24 02:11:56,301] [INFO] [config.py:904:print] train_micro_batch_size_per_gpu 1 -[2021-09-24 02:11:56,301] [INFO] [config.py:904:print] use_quantizer_kernel ......... False -[2021-09-24 02:11:56,302] [INFO] [config.py:904:print] wall_clock_breakdown ......... False -[2021-09-24 02:11:56,302] [INFO] [config.py:904:print] world_size ................... 8 -[2021-09-24 02:11:56,302] [INFO] [config.py:904:print] zero_allow_untested_optimizer False -[2021-09-24 02:11:56,302] [INFO] [config.py:904:print] zero_config .................. { - "stage": 1, - "contiguous_gradients": false, - "reduce_scatter": true, - "reduce_bucket_size": 5.000000e+08, - "allgather_partitions": true, - "allgather_bucket_size": 5.000000e+08, - "overlap_comm": false, - "load_from_fp32_weights": true, - "elastic_checkpoint": true, - "offload_param": null, - "offload_optimizer": null, - "sub_group_size": 1.000000e+09, - "prefetch_bucket_size": 5.000000e+07, - "param_persistence_threshold": 1.000000e+05, - "max_live_parameters": 1.000000e+09, - "max_reuse_distance": 1.000000e+09, - "gather_fp16_weights_on_model_save": false, - "ignore_unused_parameters": true, - "round_robin_gradients": false, - "legacy_stage1": false -} -[2021-09-24 02:11:56,302] [INFO] [config.py:904:print] zero_enabled ................. True -[2021-09-24 02:11:56,302] [INFO] [config.py:904:print] zero_optimization_stage ...... 
1 -[2021-09-24 02:11:56,302] [INFO] [config.py:906:print] json = { - "train_micro_batch_size_per_gpu": 1, - "train_batch_size": 2.048000e+03, - "gradient_clipping": 1.0, - "zero_optimization": { - "stage": 1 - }, - "fp16": { - "enabled": true, - "loss_scale": 0, - "loss_scale_window": 500, - "hysteresis": 2, - "min_loss_scale": 1, - "initial_scale_power": 12 - }, - "steps_per_print": 2.000000e+03, - "wall_clock_breakdown": false -} -[2021-09-24 02:11:56,302] [INFO] [engine.py:76:__init__] CONFIG: micro_batches=256 micro_batch_size=1 -[2021-09-24 02:11:56,606] [INFO] [engine.py:134:__init__] RANK=0 STAGE=0 LAYERS=7 [0, 7) STAGE_PARAMS=1986465792 (1986.466M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 02:11:56,606] [INFO] [engine.py:134:__init__] RANK=3 STAGE=0 LAYERS=7 [0, 7) STAGE_PARAMS=1986465792 (1986.466M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 02:11:56,606] [INFO] [engine.py:134:__init__] RANK=1 STAGE=0 LAYERS=7 [0, 7) STAGE_PARAMS=1986465792 (1986.466M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 02:11:56,606] [INFO] [engine.py:134:__init__] RANK=2 STAGE=0 LAYERS=7 [0, 7) STAGE_PARAMS=1986465792 (1986.466M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 02:11:56,606] [INFO] [engine.py:134:__init__] RANK=131 STAGE=4 LAYERS=4 [19, 23) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 02:11:56,606] [INFO] [engine.py:134:__init__] RANK=128 STAGE=4 LAYERS=4 [19, 23) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 02:11:56,606] [INFO] [engine.py:134:__init__] RANK=129 STAGE=4 LAYERS=4 [19, 23) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 02:11:56,606] [INFO] [engine.py:134:__init__] RANK=194 STAGE=6 LAYERS=4 [27, 31) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 02:11:56,606] [INFO] [engine.py:134:__init__] RANK=193 STAGE=6 LAYERS=4 [27, 31) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 02:11:56,606] [INFO] [engine.py:134:__init__] RANK=192 STAGE=6 LAYERS=4 [27, 31) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 02:11:56,606] [INFO] [engine.py:134:__init__] RANK=195 STAGE=6 LAYERS=4 [27, 31) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 02:11:56,606] [INFO] [engine.py:134:__init__] RANK=64 STAGE=2 LAYERS=4 [11, 15) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 02:11:56,606] [INFO] [engine.py:134:__init__] RANK=66 STAGE=2 LAYERS=4 [11, 15) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 02:11:56,606] [INFO] [engine.py:134:__init__] RANK=65 STAGE=2 LAYERS=4 [11, 15) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 02:11:56,606] [INFO] [engine.py:134:__init__] RANK=67 STAGE=2 LAYERS=4 [11, 15) STAGE_PARAMS=1745293312 (1745.293M) 
TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 02:11:56,606] [INFO] [engine.py:134:__init__] RANK=32 STAGE=1 LAYERS=4 [7, 11) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 02:11:56,606] [INFO] [engine.py:134:__init__] RANK=33 STAGE=1 LAYERS=4 [7, 11) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 02:11:56,606] [INFO] [engine.py:134:__init__] RANK=35 STAGE=1 LAYERS=4 [7, 11) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 02:11:56,606] [INFO] [engine.py:134:__init__] RANK=130 STAGE=4 LAYERS=4 [19, 23) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 02:11:56,606] [INFO] [engine.py:134:__init__] RANK=97 STAGE=3 LAYERS=4 [15, 19) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 02:11:56,606] [INFO] [engine.py:134:__init__] RANK=96 STAGE=3 LAYERS=4 [15, 19) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 02:11:56,606] [INFO] [engine.py:134:__init__] RANK=99 STAGE=3 LAYERS=4 [15, 19) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 02:11:56,606] [INFO] [engine.py:134:__init__] RANK=98 STAGE=3 LAYERS=4 [15, 19) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 02:11:56,606] [INFO] [engine.py:134:__init__] RANK=224 STAGE=7 LAYERS=8 [31, 39) STAGE_PARAMS=1986498560 (1986.499M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 02:11:56,606] [INFO] [engine.py:134:__init__] RANK=225 STAGE=7 LAYERS=8 [31, 39) STAGE_PARAMS=1986498560 (1986.499M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 02:11:56,606] [INFO] [engine.py:134:__init__] RANK=226 STAGE=7 LAYERS=8 [31, 39) STAGE_PARAMS=1986498560 (1986.499M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 02:11:56,606] [INFO] [engine.py:134:__init__] RANK=227 STAGE=7 LAYERS=8 [31, 39) STAGE_PARAMS=1986498560 (1986.499M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 02:11:56,606] [INFO] [engine.py:134:__init__] RANK=160 STAGE=5 LAYERS=4 [23, 27) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 02:11:56,606] [INFO] [engine.py:134:__init__] RANK=163 STAGE=5 LAYERS=4 [23, 27) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 02:11:56,606] [INFO] [engine.py:134:__init__] RANK=161 STAGE=5 LAYERS=4 [23, 27) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 02:11:56,606] [INFO] [engine.py:134:__init__] RANK=162 STAGE=5 LAYERS=4 [23, 27) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 02:11:56,606] [INFO] [engine.py:134:__init__] RANK=34 STAGE=1 LAYERS=4 [7, 11) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) 
UNIQUE_PARAMS=56814206976 (56814.207M)
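The engine's batch configuration printed above is internally consistent: a micro-batch of 1 per GPU, 256 gradient-accumulation steps (micro_batches=256), and 8 data-parallel replicas (the world_size in the config print appears to be the data-parallel size rather than the 256 total ranks) multiply out to the train_batch_size of 2048. A quick check:

# Arithmetic check of the batch settings from the config dump above.
micro_batch_per_gpu = 1   # train_micro_batch_size_per_gpu
grad_accum_steps = 256    # gradient_accumulation_steps / micro_batches
dp_replicas = 8           # world_size as used in the batch equation

assert micro_batch_per_gpu * grad_accum_steps * dp_replicas == 2048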
-[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint.
-WARNING: could not find the metadata file /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints - will not load any checkpoints and will start from random
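Both warnings above report the same condition: there is no checkpoint to resume from, so training starts from random weights. DeepSpeed resolves which checkpoint to load by reading a small text file named latest inside the checkpoint directory, which holds the tag of the most recent save. A minimal sketch of that resolution logic, with the tag format assumed for illustration:

import os

load_dir = "/gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints"
latest_path = os.path.join(load_dir, "latest")

if os.path.isfile(latest_path):
    with open(latest_path) as f:
        tag = f.read().strip()  # e.g. "global_step1000" (assumed format)
    print(f"would resume from {os.path.join(load_dir, tag)}")
else:
    # Matches the behaviour in the log: no file, start from random init.
    print("no latest file; starting from random init")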
-[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. 
-[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. 
-[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. 
-[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. 
-[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. 
-[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. 
-[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. 
-[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. 
-[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. 
-[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. 
-[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. 
-[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,752] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,753] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,753] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. 
-[2021-09-24 02:11:56,753] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,753] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,753] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,753] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,753] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,753] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,753] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,753] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,753] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,753] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,753] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. 
-[2021-09-24 02:11:56,753] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,753] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,753] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,753] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,753] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,753] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,753] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,753] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,753] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,753] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. -[2021-09-24 02:11:56,753] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint. 
-[2021-09-24 02:11:56,753] [WARNING] [engine.py:1744:load_checkpoint] Unable to find latest file at /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/latest, if trying to load latest checkpoint please ensure this file exists or pass an explicit checkpoint tag when loading a checkpoint.
-time (ms) | load-checkpoint: 1.91
-[after model, optimizer, and learning rate scheduler are built] datetime: 2021-09-24 02:11:56
-> building train, validation, and test datasets ...
- > datasets target sizes (minimum size):
-    train:      300000000
-    validation: 1638400
-    test:       10240
-> building train, validation, and test datasets for GPT ...
- > building dataset index ...
-    reading sizes...
-    reading pointers...
-    reading document index...
-    creating numpy buffer of mmap...
-    creating memory view of numpy buffer...
- > finished creating indexed dataset in 0.214922 seconds
-    number of documents: 304230423
- > dataset split:
-    train:
-        document indices in [0, 288714672) total of 288714672 documents
-    validation:
-        document indices in [288714672, 303926193) total of 15211521 documents
-    test:
-        document indices in [303926193, 304230423) total of 304230 documents
- > loading doc-idx mapping from /gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document_train_indexmap_300000000ns_2048sl_42s_doc_idx.npy
- > loading sample-idx mapping from /gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document_train_indexmap_300000000ns_2048sl_42s_sample_idx.npy
- > loading shuffle-idx mapping from /gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document_train_indexmap_300000000ns_2048sl_42s_shuffle_idx.npy
-    loaded indexed file in 0.337 seconds
-    total number of samples: 394611670
-    total number of epochs: 3
- > loading doc-idx mapping from /gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document_valid_indexmap_1638400ns_2048sl_42s_doc_idx.npy
- > loading sample-idx mapping from /gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document_valid_indexmap_1638400ns_2048sl_42s_sample_idx.npy
- > loading shuffle-idx mapping from /gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document_valid_indexmap_1638400ns_2048sl_42s_shuffle_idx.npy
-    loaded indexed file in 0.309 seconds
-    total number of samples: 6927161
-    total number of epochs: 1
- > loading doc-idx mapping from /gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document_test_indexmap_10240ns_2048sl_42s_doc_idx.npy
- > loading sample-idx mapping from /gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document_test_indexmap_10240ns_2048sl_42s_sample_idx.npy
- > loading shuffle-idx mapping from /gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document_test_indexmap_10240ns_2048sl_42s_shuffle_idx.npy
-    loaded indexed file in 0.060 seconds
-    total number of samples: 137384
-    total number of epochs: 1
-> finished creating GPT datasets ...
-[after dataloaders are built] datetime: 2021-09-24 02:12:03
-done with setup ...
-training ...
-time (ms) | model-and-optimizer-setup: 8062.72 | train/valid/test-data-iterators-setup: 5729.09
-[before the start of training step] datetime: 2021-09-24 02:12:03
-[2021-09-24 02:12:03,365] [INFO] [checkpointing.py:408:forward] Activation Checkpointing Information
-[2021-09-24 02:12:03,365] [INFO] [checkpointing.py:409:forward] ----Partition Activations False, CPU CHECKPOINTING False
-[2021-09-24 02:12:03,365] [INFO] [checkpointing.py:412:forward] ----contiguous Memory Checkpointing False with 32 total layers
-[2021-09-24 02:12:03,365] [INFO] [checkpointing.py:415:forward] ----Synchronization False
-[2021-09-24 02:12:03,365] [INFO] [checkpointing.py:416:forward] ----Profiling time in checkpointing False
-[Rank 1] (after 1 iterations) memory (MB) | allocated: 6661.611328125 | max allocated: 11742.55810546875 | reserved: 21150.0 | max reserved: 21150.0
-[Rank 33] (after 1 iterations) memory (MB) | allocated: 5861.5498046875 | max allocated: 10450.46337890625 | reserved: 18442.0 | max reserved: 18442.0
-[Rank 65] (after 1 iterations) memory (MB) | allocated: 5861.5498046875 | max allocated: 10450.46337890625 | reserved: 18442.0 | max reserved: 18442.0
-[Rank 97] (after 1 iterations) memory (MB) | allocated: 5861.5498046875 | max allocated: 10450.46337890625 | reserved: 18442.0 | max reserved: 18442.0
-[Rank 225] (after 1 iterations) memory (MB) | allocated: 7107.70751953125 | max allocated: 11884.6845703125 | reserved: 22492.0 | max reserved: 22492.0
-[Rank 129] (after 1 iterations) memory (MB) | allocated: 5861.5498046875 | max allocated: 10450.46337890625 | reserved: 18442.0 | max reserved: 18442.0
-[Rank 193] (after 1 iterations) memory (MB) | allocated: 5861.5498046875 | max allocated: 10450.46337890625 | reserved: 18586.0 | max reserved: 18586.0
-[Rank 161] (after 1 iterations) memory (MB) | allocated: 5861.5498046875 | max allocated: 10450.46337890625 | reserved: 18442.0 | max reserved: 18442.0
-[Rank 2] (after 1 iterations) memory (MB) | allocated: 6661.611328125 | max allocated: 11742.55810546875 | reserved: 21150.0 | max reserved: 21150.0
-[Rank 34] (after 1 iterations) memory (MB) | allocated: 5861.5498046875 | max allocated: 10450.46337890625 | reserved: 18442.0 | max reserved: 18442.0
-[Rank 226] (after 1 iterations) memory (MB) | allocated: 7107.70751953125 | max allocated: 11884.6845703125 | reserved: 21700.0 | max reserved: 21700.0
-[Rank 66] (after 1 iterations) memory (MB) | allocated: 5861.5498046875 | max allocated: 10450.46337890625 | reserved: 18778.0 | max reserved: 18778.0
-[Rank 98] (after 1 iterations) memory (MB) | allocated: 5861.5498046875 | max allocated: 10450.46337890625 | reserved: 18586.0 | max reserved: 18586.0
-[Rank 130] (after 1 iterations) memory (MB) | allocated: 5861.5498046875 | max allocated: 10450.46337890625 | reserved: 18442.0 | max reserved: 18442.0
-[Rank 194] (after 1 iterations) memory (MB) | allocated: 5861.5498046875 | max allocated: 10450.46337890625 | reserved: 18650.0 | max reserved: 18650.0
-[Rank 162] (after 1 iterations) memory (MB) | allocated: 5861.5498046875 | max allocated: 10450.46337890625 | reserved: 18442.0 | max reserved: 18442.0
-[Rank 0] (after 1 iterations) memory (MB) | allocated: 6661.611328125 | max allocated: 11742.55810546875 | reserved: 21470.0 | max reserved: 21470.0
-[Rank 64] (after 1 iterations) memory (MB) | allocated: 5861.5498046875 | max allocated: 10450.46337890625 | reserved: 19252.0 | max reserved: 19252.0
-[Rank 32] (after 1 iterations) memory (MB) | allocated: 5861.5498046875 | max allocated: 10450.46337890625 | reserved: 18868.0 | max reserved: 18868.0
-[Rank 128] (after 1 iterations) memory (MB) | allocated: 5861.5498046875 | max allocated: 10450.46337890625 | reserved: 18868.0 | max reserved: 18868.0
-[Rank 96] (after 1 iterations) memory (MB) | allocated: 5861.5498046875 | max allocated: 10450.46337890625 | reserved: 18868.0 | max reserved: 18868.0
-[Rank 224] (after 1 iterations) memory (MB) | allocated: 7107.70751953125 | max allocated: 11884.6845703125 | reserved: 22492.0 | max reserved: 22492.0
-[Rank 192] (after 1 iterations) memory (MB) | allocated: 5861.5498046875 | max allocated: 10450.46337890625 | reserved: 18868.0 | max reserved: 18868.0
-[Rank 160] (after 1 iterations) memory (MB) | allocated: 5861.5498046875 | max allocated: 10450.46337890625 | reserved: 18868.0 | max reserved: 18868.0
-[Rank 35] (after 1 iterations) memory (MB) | allocated: 5861.5498046875 | max allocated: 10450.46337890625 | reserved: 18442.0 | max reserved: 18442.0
-[Rank 3] (after 1 iterations) memory (MB) | allocated: 6661.611328125 | max allocated: 11742.55810546875 | reserved: 21150.0 | max reserved: 21150.0
-[Rank 67] (after 1 iterations) memory (MB) | allocated: 5861.5498046875 | max allocated: 10450.46337890625 | reserved: 18522.0 | max reserved: 18522.0
-[Rank 99] (after 1 iterations) memory (MB) | allocated: 5861.5498046875 | max allocated: 10450.46337890625 | reserved: 18442.0 | max reserved: 18442.0
-[Rank 131] (after 1 iterations) memory (MB) | allocated: 5861.5498046875 | max allocated: 10450.46337890625 | reserved: 18522.0 | max reserved: 18522.0
-[Rank 227] (after 1 iterations) memory (MB) | allocated: 7107.70751953125 | max allocated: 11884.6845703125 | reserved: 21700.0 | max reserved: 21700.0
-[Rank 195] (after 1 iterations) memory (MB) | allocated: 5861.5498046875 | max allocated: 10450.46337890625 | reserved: 18586.0 | max reserved: 18586.0
-[Rank 163] (after 1 iterations) memory (MB) | allocated: 5861.5498046875 | max allocated: 10450.46337890625 | reserved: 18442.0 | max reserved: 18442.0
- iteration 1/ 159576 | consumed samples: 16 | elapsed time per iteration (ms): 31536.2 | learning rate: 4.438E-09 | global batch size: 16 | lm loss: 1.426722E+01 | loss scale: 4096.0 | grad norm: 1863985.704 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2/ 159576 | consumed samples: 32 | elapsed time per iteration (ms): 13049.6 | learning rate: 8.876E-09 | global batch size: 16 | lm loss: 1.429125E+01 | loss scale: 4096.0 | grad norm: 1882741.499 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3/ 159576 | consumed samples: 48 | elapsed time per iteration (ms): 13671.4 | learning rate: 1.331E-08 | global batch size: 16 | lm loss: 1.421026E+01 | loss scale: 4096.0 | grad norm: 1871916.438 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4/ 159576 | consumed samples: 64 | elapsed time per iteration (ms): 13544.5 | learning rate: 1.775E-08 | global batch size: 16 | lm loss: 1.424627E+01 | loss scale: 4096.0 | grad norm: 1912485.128 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5/ 159576 | consumed samples: 80 | elapsed time per iteration (ms): 13955.0 | learning rate: 2.219E-08 | global batch size: 16 | lm loss: 1.421161E+01 | loss scale: 4096.0 | grad norm: 1873991.265 | num
zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 6/ 159576 | consumed samples: 96 | elapsed time per iteration (ms): 13725.9 | learning rate: 2.663E-08 | global batch size: 16 | lm loss: 1.423833E+01 | loss scale: 4096.0 | grad norm: 1889068.937 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7/ 159576 | consumed samples: 112 | elapsed time per iteration (ms): 13496.8 | learning rate: 3.107E-08 | global batch size: 16 | lm loss: 1.423929E+01 | loss scale: 4096.0 | grad norm: 1864001.655 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8/ 159576 | consumed samples: 128 | elapsed time per iteration (ms): 13565.8 | learning rate: 3.550E-08 | global batch size: 16 | lm loss: 1.424760E+01 | loss scale: 4096.0 | grad norm: 1867381.949 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 9/ 159576 | consumed samples: 144 | elapsed time per iteration (ms): 14076.3 | learning rate: 3.994E-08 | global batch size: 16 | lm loss: 1.418199E+01 | loss scale: 4096.0 | grad norm: 1902029.931 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 10/ 159576 | consumed samples: 160 | elapsed time per iteration (ms): 13497.5 | learning rate: 4.438E-08 | global batch size: 16 | lm loss: 1.412427E+01 | loss scale: 4096.0 | grad norm: 1865649.234 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 11/ 159576 | consumed samples: 176 | elapsed time per iteration (ms): 13459.5 | learning rate: 4.882E-08 | global batch size: 16 | lm loss: 1.407386E+01 | loss scale: 4096.0 | grad norm: 1861067.628 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 12/ 159576 | consumed samples: 192 | elapsed time per iteration (ms): 13581.0 | learning rate: 5.325E-08 | global batch size: 16 | lm loss: 1.400436E+01 | loss scale: 4096.0 | grad norm: 1857208.659 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 13/ 159576 | consumed samples: 208 | elapsed time per iteration (ms): 13877.0 | learning rate: 5.769E-08 | global batch size: 16 | lm loss: 1.374212E+01 | loss scale: 4096.0 | grad norm: 1860712.228 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 14/ 159576 | consumed samples: 224 | elapsed time per iteration (ms): 13730.6 | learning rate: 6.213E-08 | global batch size: 16 | lm loss: 1.363158E+01 | loss scale: 4096.0 | grad norm: 1835837.890 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 15/ 159576 | consumed samples: 240 | elapsed time per iteration (ms): 13589.9 | learning rate: 6.657E-08 | global batch size: 16 | lm loss: 1.353429E+01 | loss scale: 4096.0 | grad norm: 1866742.342 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 16/ 159576 | consumed samples: 256 | elapsed time per iteration (ms): 13709.9 | learning rate: 7.101E-08 | global batch size: 16 | lm loss: 1.346230E+01 | loss scale: 4096.0 | grad norm: 1867848.322 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 17/ 159576 | consumed samples: 272 | elapsed time per iteration (ms): 13515.8 | learning rate: 7.544E-08 
| global batch size: 16 | lm loss: 1.257517E+01 | loss scale: 4096.0 | grad norm: 1827444.965 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 18/ 159576 | consumed samples: 288 | elapsed time per iteration (ms): 13800.0 | learning rate: 7.988E-08 | global batch size: 16 | lm loss: 1.251998E+01 | loss scale: 4096.0 | grad norm: 2020558.797 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 19/ 159576 | consumed samples: 304 | elapsed time per iteration (ms): 13516.3 | learning rate: 8.432E-08 | global batch size: 16 | lm loss: 1.265157E+01 | loss scale: 4096.0 | grad norm: 2257407.748 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 20/ 159576 | consumed samples: 320 | elapsed time per iteration (ms): 13549.6 | learning rate: 8.876E-08 | global batch size: 16 | lm loss: 1.252521E+01 | loss scale: 4096.0 | grad norm: 2095375.557 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 21/ 159576 | consumed samples: 336 | elapsed time per iteration (ms): 13586.7 | learning rate: 9.320E-08 | global batch size: 16 | lm loss: 1.244903E+01 | loss scale: 4096.0 | grad norm: 2211855.540 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 22/ 159576 | consumed samples: 352 | elapsed time per iteration (ms): 14140.0 | learning rate: 9.763E-08 | global batch size: 16 | lm loss: 1.221426E+01 | loss scale: 4096.0 | grad norm: 2152853.946 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 23/ 159576 | consumed samples: 368 | elapsed time per iteration (ms): 13565.7 | learning rate: 1.021E-07 | global batch size: 16 | lm loss: 1.223387E+01 | loss scale: 4096.0 | grad norm: 2257726.245 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 24/ 159576 | consumed samples: 384 | elapsed time per iteration (ms): 13529.2 | learning rate: 1.065E-07 | global batch size: 16 | lm loss: 1.252795E+01 | loss scale: 4096.0 | grad norm: 2648402.060 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 25/ 159576 | consumed samples: 400 | elapsed time per iteration (ms): 13468.4 | learning rate: 1.109E-07 | global batch size: 16 | lm loss: 1.249682E+01 | loss scale: 4096.0 | grad norm: 2816711.826 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 26/ 159576 | consumed samples: 416 | elapsed time per iteration (ms): 13529.9 | learning rate: 1.154E-07 | global batch size: 16 | lm loss: 1.219784E+01 | loss scale: 4096.0 | grad norm: 2380750.659 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 27/ 159576 | consumed samples: 432 | elapsed time per iteration (ms): 13833.4 | learning rate: 1.198E-07 | global batch size: 16 | lm loss: 1.182601E+01 | loss scale: 4096.0 | grad norm: 2116005.650 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 28/ 159576 | consumed samples: 448 | elapsed time per iteration (ms): 13615.6 | learning rate: 1.243E-07 | global batch size: 16 | lm loss: 1.159655E+01 | loss scale: 4096.0 | grad norm: 1805209.516 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 
29/ 159576 | consumed samples: 464 | elapsed time per iteration (ms): 13371.2 | learning rate: 1.287E-07 | global batch size: 16 | lm loss: 1.165552E+01 | loss scale: 4096.0 | grad norm: 1731569.615 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 30/ 159576 | consumed samples: 480 | elapsed time per iteration (ms): 13604.8 | learning rate: 1.331E-07 | global batch size: 16 | lm loss: 1.154380E+01 | loss scale: 4096.0 | grad norm: 1706578.844 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 31/ 159576 | consumed samples: 496 | elapsed time per iteration (ms): 13982.3 | learning rate: 1.376E-07 | global batch size: 16 | lm loss: 1.139362E+01 | loss scale: 4096.0 | grad norm: 1757980.169 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 32/ 159576 | consumed samples: 512 | elapsed time per iteration (ms): 13306.0 | learning rate: 1.420E-07 | global batch size: 16 | lm loss: 1.148209E+01 | loss scale: 4096.0 | grad norm: 1697993.336 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 33/ 159576 | consumed samples: 528 | elapsed time per iteration (ms): 13575.8 | learning rate: 1.464E-07 | global batch size: 16 | lm loss: 1.140995E+01 | loss scale: 4096.0 | grad norm: 1670562.081 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 34/ 159576 | consumed samples: 544 | elapsed time per iteration (ms): 13613.2 | learning rate: 1.509E-07 | global batch size: 16 | lm loss: 1.132776E+01 | loss scale: 4096.0 | grad norm: 1643305.715 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 35/ 159576 | consumed samples: 560 | elapsed time per iteration (ms): 13869.9 | learning rate: 1.553E-07 | global batch size: 16 | lm loss: 1.136237E+01 | loss scale: 4096.0 | grad norm: 1648846.360 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 36/ 159576 | consumed samples: 576 | elapsed time per iteration (ms): 13789.0 | learning rate: 1.598E-07 | global batch size: 16 | lm loss: 1.143323E+01 | loss scale: 4096.0 | grad norm: 1598861.192 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 37/ 159576 | consumed samples: 592 | elapsed time per iteration (ms): 13658.0 | learning rate: 1.642E-07 | global batch size: 16 | lm loss: 1.115875E+01 | loss scale: 4096.0 | grad norm: 1562919.350 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 38/ 159576 | consumed samples: 608 | elapsed time per iteration (ms): 13961.2 | learning rate: 1.686E-07 | global batch size: 16 | lm loss: 1.117768E+01 | loss scale: 4096.0 | grad norm: 1565543.705 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 39/ 159576 | consumed samples: 624 | elapsed time per iteration (ms): 13410.4 | learning rate: 1.731E-07 | global batch size: 16 | lm loss: 1.111340E+01 | loss scale: 4096.0 | grad norm: 1536768.356 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 40/ 159576 | consumed samples: 640 | elapsed time per iteration (ms): 13891.8 | learning rate: 1.775E-07 | global batch size: 16 | lm loss: 1.106657E+01 | loss scale: 4096.0 | grad norm: 1548421.837 
| num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 41/ 159576 | consumed samples: 656 | elapsed time per iteration (ms): 13633.3 | learning rate: 1.820E-07 | global batch size: 16 | lm loss: 1.094995E+01 | loss scale: 4096.0 | grad norm: 1532446.839 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 42/ 159576 | consumed samples: 672 | elapsed time per iteration (ms): 13643.8 | learning rate: 1.864E-07 | global batch size: 16 | lm loss: 1.087856E+01 | loss scale: 4096.0 | grad norm: 1531337.842 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 43/ 159576 | consumed samples: 688 | elapsed time per iteration (ms): 13630.7 | learning rate: 1.908E-07 | global batch size: 16 | lm loss: 1.084412E+01 | loss scale: 4096.0 | grad norm: 1473539.326 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 44/ 159576 | consumed samples: 704 | elapsed time per iteration (ms): 14118.0 | learning rate: 1.953E-07 | global batch size: 16 | lm loss: 1.114596E+01 | loss scale: 4096.0 | grad norm: 1496700.678 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 45/ 159576 | consumed samples: 720 | elapsed time per iteration (ms): 13853.8 | learning rate: 1.997E-07 | global batch size: 16 | lm loss: 1.092829E+01 | loss scale: 4096.0 | grad norm: 1454980.052 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 46/ 159576 | consumed samples: 736 | elapsed time per iteration (ms): 13549.0 | learning rate: 2.041E-07 | global batch size: 16 | lm loss: 1.074461E+01 | loss scale: 4096.0 | grad norm: 1397083.505 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 47/ 159576 | consumed samples: 752 | elapsed time per iteration (ms): 13627.3 | learning rate: 2.086E-07 | global batch size: 16 | lm loss: 1.066580E+01 | loss scale: 4096.0 | grad norm: 1311670.870 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 48/ 159576 | consumed samples: 768 | elapsed time per iteration (ms): 13674.9 | learning rate: 2.130E-07 | global batch size: 16 | lm loss: 1.055744E+01 | loss scale: 4096.0 | grad norm: 1292299.744 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 49/ 159576 | consumed samples: 784 | elapsed time per iteration (ms): 13932.1 | learning rate: 2.175E-07 | global batch size: 16 | lm loss: 1.060610E+01 | loss scale: 4096.0 | grad norm: 1283482.631 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 50/ 159576 | consumed samples: 800 | elapsed time per iteration (ms): 13665.9 | learning rate: 2.219E-07 | global batch size: 16 | lm loss: 1.063007E+01 | loss scale: 4096.0 | grad norm: 1228203.240 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 51/ 159576 | consumed samples: 816 | elapsed time per iteration (ms): 13667.5 | learning rate: 2.263E-07 | global batch size: 16 | lm loss: 1.046357E+01 | loss scale: 4096.0 | grad norm: 1219490.568 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 52/ 159576 | consumed samples: 832 | elapsed time per iteration (ms): 13793.6 | learning 
rate: 2.308E-07 | global batch size: 16 | lm loss: 1.061804E+01 | loss scale: 4096.0 | grad norm: 1197068.783 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 53/ 159576 | consumed samples: 848 | elapsed time per iteration (ms): 14209.6 | learning rate: 2.352E-07 | global batch size: 16 | lm loss: 1.041930E+01 | loss scale: 4096.0 | grad norm: 1168890.772 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 54/ 159576 | consumed samples: 864 | elapsed time per iteration (ms): 13453.2 | learning rate: 2.396E-07 | global batch size: 16 | lm loss: 1.035855E+01 | loss scale: 4096.0 | grad norm: 1126594.517 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 55/ 159576 | consumed samples: 880 | elapsed time per iteration (ms): 13666.6 | learning rate: 2.441E-07 | global batch size: 16 | lm loss: 1.051081E+01 | loss scale: 4096.0 | grad norm: 1080949.187 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 56/ 159576 | consumed samples: 896 | elapsed time per iteration (ms): 13689.5 | learning rate: 2.485E-07 | global batch size: 16 | lm loss: 1.048364E+01 | loss scale: 4096.0 | grad norm: 1069119.479 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 57/ 159576 | consumed samples: 912 | elapsed time per iteration (ms): 14289.6 | learning rate: 2.530E-07 | global batch size: 16 | lm loss: 1.048154E+01 | loss scale: 4096.0 | grad norm: 1016407.938 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 58/ 159576 | consumed samples: 928 | elapsed time per iteration (ms): 13663.2 | learning rate: 2.574E-07 | global batch size: 16 | lm loss: 1.019213E+01 | loss scale: 4096.0 | grad norm: 982402.590 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 59/ 159576 | consumed samples: 944 | elapsed time per iteration (ms): 13704.5 | learning rate: 2.618E-07 | global batch size: 16 | lm loss: 1.019982E+01 | loss scale: 4096.0 | grad norm: 965254.453 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 60/ 159576 | consumed samples: 960 | elapsed time per iteration (ms): 13846.3 | learning rate: 2.663E-07 | global batch size: 16 | lm loss: 1.021626E+01 | loss scale: 4096.0 | grad norm: 926021.764 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 61/ 159576 | consumed samples: 976 | elapsed time per iteration (ms): 13469.9 | learning rate: 2.707E-07 | global batch size: 16 | lm loss: 1.008368E+01 | loss scale: 4096.0 | grad norm: 911608.476 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 62/ 159576 | consumed samples: 992 | elapsed time per iteration (ms): 13774.9 | learning rate: 2.751E-07 | global batch size: 16 | lm loss: 9.892099E+00 | loss scale: 4096.0 | grad norm: 882114.442 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 63/ 159576 | consumed samples: 1008 | elapsed time per iteration (ms): 13514.1 | learning rate: 2.796E-07 | global batch size: 16 | lm loss: 9.876393E+00 | loss scale: 4096.0 | grad norm: 834416.962 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) 
- iteration 64/ 159576 | consumed samples: 1024 | elapsed time per iteration (ms): 13538.5 | learning rate: 2.840E-07 | global batch size: 16 | lm loss: 9.927294E+00 | loss scale: 4096.0 | grad norm: 814691.882 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 65/ 159576 | consumed samples: 1040 | elapsed time per iteration (ms): 13496.5 | learning rate: 2.885E-07 | global batch size: 16 | lm loss: 1.024293E+01 | loss scale: 4096.0 | grad norm: 821175.330 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 66/ 159576 | consumed samples: 1056 | elapsed time per iteration (ms): 14030.7 | learning rate: 2.929E-07 | global batch size: 16 | lm loss: 9.930872E+00 | loss scale: 4096.0 | grad norm: 759629.854 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 67/ 159576 | consumed samples: 1072 | elapsed time per iteration (ms): 13743.1 | learning rate: 2.973E-07 | global batch size: 16 | lm loss: 9.852800E+00 | loss scale: 4096.0 | grad norm: 734440.980 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 68/ 159576 | consumed samples: 1088 | elapsed time per iteration (ms): 13293.2 | learning rate: 3.018E-07 | global batch size: 16 | lm loss: 9.786448E+00 | loss scale: 4096.0 | grad norm: 702591.247 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 69/ 159576 | consumed samples: 1104 | elapsed time per iteration (ms): 13515.6 | learning rate: 3.062E-07 | global batch size: 16 | lm loss: 9.917148E+00 | loss scale: 4096.0 | grad norm: 689937.545 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 70/ 159576 | consumed samples: 1120 | elapsed time per iteration (ms): 13786.0 | learning rate: 3.107E-07 | global batch size: 16 | lm loss: 9.593161E+00 | loss scale: 4096.0 | grad norm: 634541.803 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 71/ 159576 | consumed samples: 1136 | elapsed time per iteration (ms): 13761.6 | learning rate: 3.151E-07 | global batch size: 16 | lm loss: 9.685747E+00 | loss scale: 4096.0 | grad norm: 620089.160 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 72/ 159576 | consumed samples: 1152 | elapsed time per iteration (ms): 13503.1 | learning rate: 3.195E-07 | global batch size: 16 | lm loss: 9.550736E+00 | loss scale: 4096.0 | grad norm: 592735.898 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 73/ 159576 | consumed samples: 1168 | elapsed time per iteration (ms): 13574.6 | learning rate: 3.240E-07 | global batch size: 16 | lm loss: 9.780053E+00 | loss scale: 4096.0 | grad norm: 578902.468 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 74/ 159576 | consumed samples: 1184 | elapsed time per iteration (ms): 13563.6 | learning rate: 3.284E-07 | global batch size: 16 | lm loss: 9.660094E+00 | loss scale: 4096.0 | grad norm: 549632.302 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 75/ 159576 | consumed samples: 1200 | elapsed time per iteration (ms): 13751.3 | learning rate: 3.328E-07 | global batch size: 16 | lm loss: 9.715110E+00 | loss scale: 4096.0 | grad norm: 
523457.012 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 76/ 159576 | consumed samples: 1216 | elapsed time per iteration (ms): 13613.9 | learning rate: 3.373E-07 | global batch size: 16 | lm loss: 9.548697E+00 | loss scale: 4096.0 | grad norm: 559789.568 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 77/ 159576 | consumed samples: 1232 | elapsed time per iteration (ms): 13668.9 | learning rate: 3.417E-07 | global batch size: 16 | lm loss: 9.395579E+00 | loss scale: 4096.0 | grad norm: 516053.141 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 78/ 159576 | consumed samples: 1248 | elapsed time per iteration (ms): 13540.8 | learning rate: 3.462E-07 | global batch size: 16 | lm loss: 9.450207E+00 | loss scale: 4096.0 | grad norm: 491518.990 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 79/ 159576 | consumed samples: 1264 | elapsed time per iteration (ms): 13951.5 | learning rate: 3.506E-07 | global batch size: 16 | lm loss: 9.312221E+00 | loss scale: 4096.0 | grad norm: 445025.682 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 80/ 159576 | consumed samples: 1280 | elapsed time per iteration (ms): 13710.1 | learning rate: 3.550E-07 | global batch size: 16 | lm loss: 9.362122E+00 | loss scale: 4096.0 | grad norm: 498046.459 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 81/ 159576 | consumed samples: 1296 | elapsed time per iteration (ms): 13653.8 | learning rate: 3.595E-07 | global batch size: 16 | lm loss: 9.684261E+00 | loss scale: 4096.0 | grad norm: 460137.704 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 82/ 159576 | consumed samples: 1312 | elapsed time per iteration (ms): 13416.1 | learning rate: 3.639E-07 | global batch size: 16 | lm loss: 9.111031E+00 | loss scale: 4096.0 | grad norm: 462196.098 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 83/ 159576 | consumed samples: 1328 | elapsed time per iteration (ms): 13589.7 | learning rate: 3.683E-07 | global batch size: 16 | lm loss: 9.424231E+00 | loss scale: 4096.0 | grad norm: 387492.278 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 84/ 159576 | consumed samples: 1344 | elapsed time per iteration (ms): 13890.8 | learning rate: 3.728E-07 | global batch size: 16 | lm loss: 9.225885E+00 | loss scale: 4096.0 | grad norm: 477146.862 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 85/ 159576 | consumed samples: 1360 | elapsed time per iteration (ms): 13578.1 | learning rate: 3.772E-07 | global batch size: 16 | lm loss: 9.449253E+00 | loss scale: 4096.0 | grad norm: 498838.088 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 86/ 159576 | consumed samples: 1376 | elapsed time per iteration (ms): 13600.8 | learning rate: 3.817E-07 | global batch size: 16 | lm loss: 9.186915E+00 | loss scale: 4096.0 | grad norm: 359821.133 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 87/ 159576 | consumed samples: 1392 | elapsed time per iteration (ms): 13578.0 | 
learning rate: 3.861E-07 | global batch size: 16 | lm loss: 9.169426E+00 | loss scale: 4096.0 | grad norm: 336361.334 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 88/ 159576 | consumed samples: 1408 | elapsed time per iteration (ms): 14258.1 | learning rate: 3.905E-07 | global batch size: 16 | lm loss: 9.174639E+00 | loss scale: 4096.0 | grad norm: 513262.304 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 89/ 159576 | consumed samples: 1424 | elapsed time per iteration (ms): 13350.5 | learning rate: 3.950E-07 | global batch size: 16 | lm loss: 9.322023E+00 | loss scale: 4096.0 | grad norm: 417913.413 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 90/ 159576 | consumed samples: 1440 | elapsed time per iteration (ms): 13582.0 | learning rate: 3.994E-07 | global batch size: 16 | lm loss: 9.319530E+00 | loss scale: 4096.0 | grad norm: 326159.953 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 91/ 159576 | consumed samples: 1456 | elapsed time per iteration (ms): 13577.6 | learning rate: 4.038E-07 | global batch size: 16 | lm loss: 9.305362E+00 | loss scale: 4096.0 | grad norm: 312504.506 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 92/ 159576 | consumed samples: 1472 | elapsed time per iteration (ms): 13979.9 | learning rate: 4.083E-07 | global batch size: 16 | lm loss: 8.797226E+00 | loss scale: 4096.0 | grad norm: 299274.584 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 93/ 159576 | consumed samples: 1488 | elapsed time per iteration (ms): 13685.6 | learning rate: 4.127E-07 | global batch size: 16 | lm loss: 9.470177E+00 | loss scale: 4096.0 | grad norm: 889931.672 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 94/ 159576 | consumed samples: 1504 | elapsed time per iteration (ms): 13625.1 | learning rate: 4.172E-07 | global batch size: 16 | lm loss: 9.601658E+00 | loss scale: 4096.0 | grad norm: 858157.270 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 95/ 159576 | consumed samples: 1520 | elapsed time per iteration (ms): 13713.7 | learning rate: 4.216E-07 | global batch size: 16 | lm loss: 9.093191E+00 | loss scale: 4096.0 | grad norm: 308888.782 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 96/ 159576 | consumed samples: 1536 | elapsed time per iteration (ms): 13441.7 | learning rate: 4.260E-07 | global batch size: 16 | lm loss: 9.258781E+00 | loss scale: 4096.0 | grad norm: 285375.841 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 97/ 159576 | consumed samples: 1552 | elapsed time per iteration (ms): 13952.1 | learning rate: 4.305E-07 | global batch size: 16 | lm loss: 9.267257E+00 | loss scale: 4096.0 | grad norm: 266598.437 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 98/ 159576 | consumed samples: 1568 | elapsed time per iteration (ms): 13570.4 | learning rate: 4.349E-07 | global batch size: 16 | lm loss: 9.302748E+00 | loss scale: 4096.0 | grad norm: 430050.353 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 
| -time (ms) - iteration 99/ 159576 | consumed samples: 1584 | elapsed time per iteration (ms): 13655.7 | learning rate: 4.393E-07 | global batch size: 16 | lm loss: 9.206352E+00 | loss scale: 4096.0 | grad norm: 522965.120 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 100/ 159576 | consumed samples: 1600 | elapsed time per iteration (ms): 13606.3 | learning rate: 4.438E-07 | global batch size: 16 | lm loss: 9.212991E+00 | loss scale: 4096.0 | grad norm: 351294.826 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 101/ 159576 | consumed samples: 1616 | elapsed time per iteration (ms): 14021.3 | learning rate: 4.482E-07 | global batch size: 16 | lm loss: 9.392309E+00 | loss scale: 4096.0 | grad norm: 249407.405 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 102/ 159576 | consumed samples: 1632 | elapsed time per iteration (ms): 13722.5 | learning rate: 4.527E-07 | global batch size: 16 | lm loss: 9.173745E+00 | loss scale: 4096.0 | grad norm: 230190.700 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 103/ 159576 | consumed samples: 1648 | elapsed time per iteration (ms): 13481.3 | learning rate: 4.571E-07 | global batch size: 16 | lm loss: 9.060183E+00 | loss scale: 4096.0 | grad norm: 535519.642 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 104/ 159576 | consumed samples: 1664 | elapsed time per iteration (ms): 13573.2 | learning rate: 4.615E-07 | global batch size: 16 | lm loss: 8.820353E+00 | loss scale: 4096.0 | grad norm: 252106.297 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 105/ 159576 | consumed samples: 1680 | elapsed time per iteration (ms): 13679.8 | learning rate: 4.660E-07 | global batch size: 16 | lm loss: 8.907228E+00 | loss scale: 4096.0 | grad norm: 227304.496 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 106/ 159576 | consumed samples: 1696 | elapsed time per iteration (ms): 13833.6 | learning rate: 4.704E-07 | global batch size: 16 | lm loss: 8.920894E+00 | loss scale: 4096.0 | grad norm: 226622.044 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 107/ 159576 | consumed samples: 1712 | elapsed time per iteration (ms): 13577.9 | learning rate: 4.749E-07 | global batch size: 16 | lm loss: 8.839094E+00 | loss scale: 4096.0 | grad norm: 188033.687 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 108/ 159576 | consumed samples: 1728 | elapsed time per iteration (ms): 13620.7 | learning rate: 4.793E-07 | global batch size: 16 | lm loss: 9.072345E+00 | loss scale: 4096.0 | grad norm: 405511.072 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 109/ 159576 | consumed samples: 1744 | elapsed time per iteration (ms): 13608.5 | learning rate: 4.837E-07 | global batch size: 16 | lm loss: 8.981932E+00 | loss scale: 4096.0 | grad norm: 326365.949 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 110/ 159576 | consumed samples: 1760 | elapsed time per iteration (ms): 13945.7 | learning rate: 4.882E-07 | global batch size: 16 | lm loss: 8.900158E+00 | loss 
scale: 4096.0 | grad norm: 183771.399 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 111/ 159576 | consumed samples: 1776 | elapsed time per iteration (ms): 13542.6 | learning rate: 4.926E-07 | global batch size: 16 | lm loss: 8.908926E+00 | loss scale: 4096.0 | grad norm: 189581.109 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 112/ 159576 | consumed samples: 1792 | elapsed time per iteration (ms): 13715.6 | learning rate: 4.970E-07 | global batch size: 16 | lm loss: 8.738115E+00 | loss scale: 4096.0 | grad norm: 176974.824 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 113/ 159576 | consumed samples: 1808 | elapsed time per iteration (ms): 13456.9 | learning rate: 5.015E-07 | global batch size: 16 | lm loss: 9.185429E+00 | loss scale: 4096.0 | grad norm: 452577.591 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 114/ 159576 | consumed samples: 1824 | elapsed time per iteration (ms): 14039.5 | learning rate: 5.059E-07 | global batch size: 16 | lm loss: 9.235853E+00 | loss scale: 4096.0 | grad norm: 567475.961 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 115/ 159576 | consumed samples: 1840 | elapsed time per iteration (ms): 13568.6 | learning rate: 5.104E-07 | global batch size: 16 | lm loss: 8.848898E+00 | loss scale: 4096.0 | grad norm: 182062.035 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 116/ 159576 | consumed samples: 1856 | elapsed time per iteration (ms): 13607.1 | learning rate: 5.148E-07 | global batch size: 16 | lm loss: 8.955499E+00 | loss scale: 4096.0 | grad norm: 179172.056 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 117/ 159576 | consumed samples: 1872 | elapsed time per iteration (ms): 13798.7 | learning rate: 5.192E-07 | global batch size: 16 | lm loss: 8.835221E+00 | loss scale: 4096.0 | grad norm: 168846.925 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 118/ 159576 | consumed samples: 1888 | elapsed time per iteration (ms): 13424.3 | learning rate: 5.237E-07 | global batch size: 16 | lm loss: 9.120043E+00 | loss scale: 4096.0 | grad norm: 304218.818 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 119/ 159576 | consumed samples: 1904 | elapsed time per iteration (ms): 13992.7 | learning rate: 5.281E-07 | global batch size: 16 | lm loss: 8.877877E+00 | loss scale: 4096.0 | grad norm: 328004.326 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 120/ 159576 | consumed samples: 1920 | elapsed time per iteration (ms): 13739.9 | learning rate: 5.325E-07 | global batch size: 16 | lm loss: 9.091492E+00 | loss scale: 4096.0 | grad norm: 542667.397 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 121/ 159576 | consumed samples: 1936 | elapsed time per iteration (ms): 13438.9 | learning rate: 5.370E-07 | global batch size: 16 | lm loss: 8.963889E+00 | loss scale: 4096.0 | grad norm: 173633.066 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 122/ 159576 | consumed samples: 1952 | 
elapsed time per iteration (ms): 13659.9 | learning rate: 5.414E-07 | global batch size: 16 | lm loss: 8.973601E+00 | loss scale: 4096.0 | grad norm: 154883.483 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 123/ 159576 | consumed samples: 1968 | elapsed time per iteration (ms): 14034.9 | learning rate: 5.459E-07 | global batch size: 16 | lm loss: 8.932154E+00 | loss scale: 4096.0 | grad norm: 191305.172 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 124/ 159576 | consumed samples: 1984 | elapsed time per iteration (ms): 13642.6 | learning rate: 5.503E-07 | global batch size: 16 | lm loss: 8.718765E+00 | loss scale: 4096.0 | grad norm: 141927.967 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 125/ 159576 | consumed samples: 2000 | elapsed time per iteration (ms): 13607.3 | learning rate: 5.547E-07 | global batch size: 16 | lm loss: 9.022717E+00 | loss scale: 4096.0 | grad norm: 530230.902 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 126/ 159576 | consumed samples: 2016 | elapsed time per iteration (ms): 13623.2 | learning rate: 5.592E-07 | global batch size: 16 | lm loss: 9.160154E+00 | loss scale: 4096.0 | grad norm: 525377.320 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 127/ 159576 | consumed samples: 2032 | elapsed time per iteration (ms): 13944.5 | learning rate: 5.636E-07 | global batch size: 16 | lm loss: 8.602621E+00 | loss scale: 4096.0 | grad norm: 180832.122 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 128/ 159576 | consumed samples: 2048 | elapsed time per iteration (ms): 13652.1 | learning rate: 5.680E-07 | global batch size: 16 | lm loss: 8.848473E+00 | loss scale: 4096.0 | grad norm: 159006.909 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 129/ 159576 | consumed samples: 2064 | elapsed time per iteration (ms): 13619.4 | learning rate: 5.725E-07 | global batch size: 16 | lm loss: 8.697285E+00 | loss scale: 4096.0 | grad norm: 166208.955 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 130/ 159576 | consumed samples: 2080 | elapsed time per iteration (ms): 13649.8 | learning rate: 5.769E-07 | global batch size: 16 | lm loss: 8.738346E+00 | loss scale: 4096.0 | grad norm: 142582.672 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 131/ 159576 | consumed samples: 2096 | elapsed time per iteration (ms): 13648.8 | learning rate: 5.814E-07 | global batch size: 16 | lm loss: 8.628532E+00 | loss scale: 4096.0 | grad norm: 119745.012 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 132/ 159576 | consumed samples: 2112 | elapsed time per iteration (ms): 13855.7 | learning rate: 5.858E-07 | global batch size: 16 | lm loss: 8.681314E+00 | loss scale: 4096.0 | grad norm: 238581.530 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 133/ 159576 | consumed samples: 2128 | elapsed time per iteration (ms): 13614.3 | learning rate: 5.902E-07 | global batch size: 16 | lm loss: 8.853155E+00 | loss scale: 4096.0 | grad norm: 190597.797 | num zeros: 0.0 | number 
of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 134/ 159576 | consumed samples: 2144 | elapsed time per iteration (ms): 13742.8 | learning rate: 5.947E-07 | global batch size: 16 | lm loss: 8.840850E+00 | loss scale: 4096.0 | grad norm: 157001.058 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 135/ 159576 | consumed samples: 2160 | elapsed time per iteration (ms): 13481.4 | learning rate: 5.991E-07 | global batch size: 16 | lm loss: 8.721090E+00 | loss scale: 4096.0 | grad norm: 120761.062 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 136/ 159576 | consumed samples: 2176 | elapsed time per iteration (ms): 14037.0 | learning rate: 6.036E-07 | global batch size: 16 | lm loss: 8.786610E+00 | loss scale: 4096.0 | grad norm: 109166.988 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 137/ 159576 | consumed samples: 2192 | elapsed time per iteration (ms): 13631.2 | learning rate: 6.080E-07 | global batch size: 16 | lm loss: 8.825349E+00 | loss scale: 4096.0 | grad norm: 393039.207 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 138/ 159576 | consumed samples: 2208 | elapsed time per iteration (ms): 13698.2 | learning rate: 6.124E-07 | global batch size: 16 | lm loss: 8.681873E+00 | loss scale: 4096.0 | grad norm: 210924.024 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 139/ 159576 | consumed samples: 2224 | elapsed time per iteration (ms): 13641.8 | learning rate: 6.169E-07 | global batch size: 16 | lm loss: 8.758416E+00 | loss scale: 4096.0 | grad norm: 111138.195 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 140/ 159576 | consumed samples: 2240 | elapsed time per iteration (ms): 13650.3 | learning rate: 6.213E-07 | global batch size: 16 | lm loss: 8.646829E+00 | loss scale: 4096.0 | grad norm: 115663.463 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 141/ 159576 | consumed samples: 2256 | elapsed time per iteration (ms): 14097.3 | learning rate: 6.257E-07 | global batch size: 16 | lm loss: 8.653087E+00 | loss scale: 4096.0 | grad norm: 142126.653 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 142/ 159576 | consumed samples: 2272 | elapsed time per iteration (ms): 13468.2 | learning rate: 6.302E-07 | global batch size: 16 | lm loss: 8.647311E+00 | loss scale: 4096.0 | grad norm: 163914.852 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 143/ 159576 | consumed samples: 2288 | elapsed time per iteration (ms): 13544.7 | learning rate: 6.346E-07 | global batch size: 16 | lm loss: 8.564240E+00 | loss scale: 4096.0 | grad norm: 159952.939 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 144/ 159576 | consumed samples: 2304 | elapsed time per iteration (ms): 13642.1 | learning rate: 6.391E-07 | global batch size: 16 | lm loss: 8.789017E+00 | loss scale: 4096.0 | grad norm: 169255.588 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 145/ 159576 | consumed samples: 2320 | elapsed time per iteration (ms): 14181.4 | learning rate: 6.435E-07 | global batch size: 16 | lm loss: 8.811962E+00 | loss scale: 4096.0 | grad norm: 127162.884 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 146/ 159576 | consumed samples: 2336 | elapsed time per iteration (ms): 13492.3 | learning rate: 6.479E-07 | global batch size: 16 | lm loss: 8.774818E+00 | loss scale: 4096.0 | grad norm: 110483.274 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 147/ 159576 | consumed samples: 2352 | elapsed time per iteration (ms): 13671.3 | learning rate: 6.524E-07 | global batch size: 16 | lm loss: 8.753700E+00 | loss scale: 4096.0 | grad norm: 128181.260 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 148/ 159576 | consumed samples: 2368 | elapsed time per iteration (ms): 13675.0 | learning rate: 6.568E-07 | global batch size: 16 | lm loss: 8.742964E+00 | loss scale: 4096.0 | grad norm: 140698.611 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 149/ 159576 | consumed samples: 2384 | elapsed time per iteration (ms): 14154.8 | learning rate: 6.612E-07 | global batch size: 16 | lm loss: 8.705631E+00 | loss scale: 4096.0 | grad norm: 284561.708 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 150/ 159576 | consumed samples: 2400 | elapsed time per iteration (ms): 13301.3 | learning rate: 6.657E-07 | global batch size: 16 | lm loss: 8.639321E+00 | loss scale: 4096.0 | grad norm: 158457.469 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 151/ 159576 | consumed samples: 2416 | elapsed time per iteration (ms): 13553.4 | learning rate: 6.701E-07 | global batch size: 16 | lm loss: 8.747204E+00 | loss scale: 4096.0 | grad norm: 217035.827 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 152/ 159576 | consumed samples: 2432 | elapsed time per iteration (ms): 13577.6 | learning rate: 6.746E-07 | global batch size: 16 | lm loss: 8.711011E+00 | loss scale: 4096.0 | grad norm: 170149.010 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 153/ 159576 | consumed samples: 2448 | elapsed time per iteration (ms): 13522.0 | learning rate: 6.790E-07 | global batch size: 16 | lm loss: 8.717499E+00 | loss scale: 4096.0 | grad norm: 103133.580 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 154/ 159576 | consumed samples: 2464 | elapsed time per iteration (ms): 13883.8 | learning rate: 6.834E-07 | global batch size: 16 | lm loss: 8.587013E+00 | loss scale: 4096.0 | grad norm: 99765.078 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 155/ 159576 | consumed samples: 2480 | elapsed time per iteration (ms): 13554.0 | learning rate: 6.879E-07 | global batch size: 16 | lm loss: 8.698885E+00 | loss scale: 4096.0 | grad norm: 282680.223 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 156/ 159576 | consumed samples: 2496 | elapsed time per iteration (ms): 13692.4 | learning rate: 6.923E-07 | global batch size: 16 | lm loss: 9.289864E+00 | loss scale: 4096.0 | grad norm: 609278.865 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 157/ 159576 | consumed samples: 2512 | elapsed time per iteration (ms): 13306.0 | learning rate: 6.967E-07 | global batch size: 16 | lm loss: 8.803203E+00 | loss scale: 4096.0 | grad norm: 221182.708 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-[2021-09-24 02:48:11] PULSE: tr8-104B is waiting to be scheduled (1159457_[1-10%1] on 'gpu_p13' partition)
-[2021-09-24 02:48:11] PULSE: tr8-104B is scheduled to start in 18:26:36 (at 2021-09-24T21:14:48) (1161605 on 'gpu_p13' partition)
-[2021-09-24 02:48:11] PULSE: tr8-104B is running for 37:09 since 2021-09-24T02:11:02 (1161730 on 'gpu_p13' partition (r6i4n7,r6i5n[7-8],r6i6n[0,6,8],r6i7n3,r7i2n[2,4-5],r7i3n2,r7i6n[2-4],r7i7n[3,7-8],r8i0n[2-3,5-8],r8i1n[0,2-4],r8i3n[0-2],r8i5n[3-4],r8i7n[3-6,8],r9i0n[0-2],r9i1n[0-3],r9i2n[3-5,8],r9i3n[0-1,7-8],r9i4n[0-2],r9i5n[3-8],r9i6n[0,7-8])
- iteration 158/ 159576 | consumed samples: 2528 | elapsed time per iteration (ms): 13873.2 | learning rate: 7.012E-07 | global batch size: 16 | lm loss: 8.628306E+00 | loss scale: 4096.0 | grad norm: 200507.061 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 159/ 159576 | consumed samples: 2544 | elapsed time per iteration (ms): 13466.2 | learning rate: 7.056E-07 | global batch size: 16 | lm loss: 8.632781E+00 | loss scale: 4096.0 | grad norm: 103638.607 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 160/ 159576 | consumed samples: 2560 | elapsed time per iteration (ms): 13494.3 | learning rate: 7.101E-07 | global batch size: 16 | lm loss: 8.596104E+00 | loss scale: 4096.0 | grad norm: 92105.558 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 161/ 159576 | consumed samples: 2576 | elapsed time per iteration (ms): 13517.5 | learning rate: 7.145E-07 | global batch size: 16 | lm loss: 8.408714E+00 | loss scale: 4096.0 | grad norm: 78965.627 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 162/ 159576 | consumed samples: 2592 | elapsed time per iteration (ms): 13540.1 | learning rate: 7.189E-07 | global batch size: 16 | lm loss: 9.134837E+00 | loss scale: 4096.0 | grad norm: 524949.559 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 163/ 159576 | consumed samples: 2608 | elapsed time per iteration (ms): 13879.1 | learning rate: 7.234E-07 | global batch size: 16 | lm loss: 8.601346E+00 | loss scale: 4096.0 | grad norm: 206465.490 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 164/ 159576 | consumed samples: 2624 | elapsed time per iteration (ms): 13564.5 | learning rate: 7.278E-07 | global batch size: 16 | lm loss: 8.734079E+00 | loss scale: 4096.0 | grad norm: 159985.137 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 165/ 159576 | consumed samples: 2640 | elapsed time per iteration (ms): 13607.4 | learning rate: 7.322E-07 | global batch size: 16 | lm loss: 8.629238E+00 | loss scale: 4096.0 | grad norm: 89678.564 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 166/ 159576 | consumed samples: 2656 | elapsed time per iteration (ms): 13687.7 | learning rate: 7.367E-07 | global batch size: 16 | lm loss: 8.753635E+00 | loss scale: 4096.0 | grad norm: 108761.613 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 167/ 159576 | consumed samples: 2672 | elapsed time per iteration (ms): 14101.4 | learning rate: 7.411E-07 | global batch size: 16 | lm loss: 8.647141E+00 | loss scale: 4096.0 | grad norm: 78778.670 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 168/ 159576 | consumed samples: 2688 | elapsed time per iteration (ms): 13827.5 | learning rate: 7.456E-07 | global batch size: 16 | lm loss: 8.838135E+00 | loss scale: 4096.0 | grad norm: 301360.421 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 169/ 159576 | consumed samples: 2704 | elapsed time per iteration (ms): 13776.5 | learning rate: 7.500E-07 | global batch size: 16 | lm loss: 8.865972E+00 | loss scale: 4096.0 | grad norm: 230779.992 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 170/ 159576 | consumed samples: 2720 | elapsed time per iteration (ms): 13667.3 | learning rate: 7.544E-07 | global batch size: 16 | lm loss: 8.716210E+00 | loss scale: 4096.0 | grad norm: 133087.211 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 171/ 159576 | consumed samples: 2736 | elapsed time per iteration (ms): 13974.1 | learning rate: 7.589E-07 | global batch size: 16 | lm loss: 8.726005E+00 | loss scale: 4096.0 | grad norm: 112595.632 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 172/ 159576 | consumed samples: 2752 | elapsed time per iteration (ms): 13644.3 | learning rate: 7.633E-07 | global batch size: 16 | lm loss: 8.704071E+00 | loss scale: 4096.0 | grad norm: 92111.748 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 173/ 159576 | consumed samples: 2768 | elapsed time per iteration (ms): 13586.4 | learning rate: 7.678E-07 | global batch size: 16 | lm loss: 8.823001E+00 | loss scale: 4096.0 | grad norm: 93068.020 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 174/ 159576 | consumed samples: 2784 | elapsed time per iteration (ms): 13629.3 | learning rate: 7.722E-07 | global batch size: 16 | lm loss: 8.521597E+00 | loss scale: 4096.0 | grad norm: 79887.666 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 175/ 159576 | consumed samples: 2800 | elapsed time per iteration (ms): 13647.0 | learning rate: 7.766E-07 | global batch size: 16 | lm loss: 9.370278E+00 | loss scale: 4096.0 | grad norm: 576797.121 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 176/ 159576 | consumed samples: 2816 | elapsed time per iteration (ms): 13993.8 | learning rate: 7.811E-07 | global batch size: 16 | lm loss: 9.255205E+00 | loss scale: 4096.0 | grad norm: 337846.372 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 177/ 159576 | consumed samples: 2832 | elapsed time per iteration (ms): 13778.2 | learning rate: 7.855E-07 | global batch size: 16 | lm loss: 9.038449E+00 | loss scale: 4096.0 | grad norm: 339366.601 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 178/ 159576 | consumed samples: 2848 | elapsed time per iteration (ms): 13515.3 | learning rate: 7.899E-07 | global batch size: 16 | lm loss: 8.771539E+00 | loss scale: 4096.0 | grad norm: 216761.610 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 179/ 159576 | consumed samples: 2864 | elapsed time per iteration (ms): 13657.6 | learning rate: 7.944E-07 | global batch size: 16 | lm loss: 8.718536E+00 | loss scale: 4096.0 | grad norm: 103470.129 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 180/ 159576 | consumed samples: 2880 | elapsed time per iteration (ms): 14095.5 | learning rate: 7.988E-07 | global batch size: 16 | lm loss: 8.968449E+00 | loss scale: 4096.0 | grad norm: 88300.652 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 181/ 159576 | consumed samples: 2896 | elapsed time per iteration (ms): 13570.0 | learning rate: 8.033E-07 | global batch size: 16 | lm loss: 8.743597E+00 | loss scale: 4096.0 | grad norm: 73637.354 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 182/ 159576 | consumed samples: 2912 | elapsed time per iteration (ms): 13631.2 | learning rate: 8.077E-07 | global batch size: 16 | lm loss: 8.650385E+00 | loss scale: 4096.0 | grad norm: 170612.165 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 183/ 159576 | consumed samples: 2928 | elapsed time per iteration (ms): 13666.1 | learning rate: 8.121E-07 | global batch size: 16 | lm loss: 8.764441E+00 | loss scale: 4096.0 | grad norm: 157032.537 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 184/ 159576 | consumed samples: 2944 | elapsed time per iteration (ms): 14033.7 | learning rate: 8.166E-07 | global batch size: 16 | lm loss: 8.546231E+00 | loss scale: 4096.0 | grad norm: 68818.140 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 185/ 159576 | consumed samples: 2960 | elapsed time per iteration (ms): 13755.2 | learning rate: 8.210E-07 | global batch size: 16 | lm loss: 8.605597E+00 | loss scale: 4096.0 | grad norm: 245599.472 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 186/ 159576 | consumed samples: 2976 | elapsed time per iteration (ms): 13693.9 | learning rate: 8.254E-07 | global batch size: 16 | lm loss: 8.735710E+00 | loss scale: 4096.0 | grad norm: 193090.020 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 187/ 159576 | consumed samples: 2992 | elapsed time per iteration (ms): 13666.7 | learning rate: 8.299E-07 | global batch size: 16 | lm loss: 8.800616E+00 | loss scale: 4096.0 | grad norm: 121643.211 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 188/ 159576 | consumed samples: 3008 | elapsed time per iteration (ms): 13617.1 | learning rate: 8.343E-07 | global batch size: 16 | lm loss: 8.450140E+00 | loss scale: 4096.0 | grad norm: 91010.312 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 189/ 159576 | consumed samples: 3024 | elapsed time per iteration (ms): 14107.4 | learning rate: 8.388E-07 | global batch size: 16 | lm loss: 8.680673E+00 | loss scale: 4096.0 | grad norm: 171815.380 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 190/ 159576 | consumed samples: 3040 | elapsed time per iteration (ms): 13662.7 | learning rate: 8.432E-07 | global batch size: 16 | lm loss: 8.619300E+00 | loss scale: 4096.0 | grad norm: 80825.030 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 191/ 159576 | consumed samples: 3056 | elapsed time per iteration (ms): 13715.7 | learning rate: 8.476E-07 | global batch size: 16 | lm loss: 8.438683E+00 | loss scale: 4096.0 | grad norm: 68255.978 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 192/ 159576 | consumed samples: 3072 | elapsed time per iteration (ms): 13611.5 | learning rate: 8.521E-07 | global batch size: 16 | lm loss: 8.685935E+00 | loss scale: 4096.0 | grad norm: 100702.747 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 193/ 159576 | consumed samples: 3088 | elapsed time per iteration (ms): 14234.2 | learning rate: 8.565E-07 | global batch size: 16 | lm loss: 8.644808E+00 | loss scale: 4096.0 | grad norm: 193299.432 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 194/ 159576 | consumed samples: 3104 | elapsed time per iteration (ms): 13631.4 | learning rate: 8.609E-07 | global batch size: 16 | lm loss: 8.574228E+00 | loss scale: 4096.0 | grad norm: 141638.439 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 195/ 159576 | consumed samples: 3120 | elapsed time per iteration (ms): 13610.1 | learning rate: 8.654E-07 | global batch size: 16 | lm loss: 8.461662E+00 | loss scale: 4096.0 | grad norm: 102623.541 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 196/ 159576 | consumed samples: 3136 | elapsed time per iteration (ms): 13581.2 | learning rate: 8.698E-07 | global batch size: 16 | lm loss: 8.478310E+00 | loss scale: 4096.0 | grad norm: 64740.797 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 197/ 159576 | consumed samples: 3152 | elapsed time per iteration (ms): 13626.3 | learning rate: 8.743E-07 | global batch size: 16 | lm loss: 8.468125E+00 | loss scale: 4096.0 | grad norm: 113590.460 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 198/ 159576 | consumed samples: 3168 | elapsed time per iteration (ms): 14045.8 | learning rate: 8.787E-07 | global batch size: 16 | lm loss: 8.800446E+00 | loss scale: 4096.0 | grad norm: 157117.309 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 199/ 159576 | consumed samples: 3184 | elapsed time per iteration (ms): 13670.2 | learning rate: 8.831E-07 | global batch size: 16 | lm loss: 8.530574E+00 | loss scale: 4096.0 | grad norm: 71020.347 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 200/ 159576 | consumed samples: 3200 | elapsed time per iteration (ms): 13673.4 | learning rate: 8.876E-07 | global batch size: 16 | lm loss: 8.573134E+00 | loss scale: 4096.0 | grad norm: 68974.846 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 201/ 159576 | consumed samples: 3216 | elapsed time per iteration (ms): 13793.0 | learning rate: 8.920E-07 | global batch size: 16 | lm loss: 8.408599E+00 | loss scale: 4096.0 | grad norm: 69080.768 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 202/ 159576 | consumed samples: 3232 | elapsed time per iteration (ms): 13826.3 | learning rate: 8.964E-07 | global batch size: 16 | lm loss: 8.511511E+00 | loss scale: 4096.0 | grad norm: 111260.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 203/ 159576 | consumed samples: 3248 | elapsed time per iteration (ms): 13532.8 | learning rate: 9.009E-07 | global batch size: 16 | lm loss: 8.359414E+00 | loss scale: 4096.0 | grad norm: 178104.845 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 204/ 159576 | consumed samples: 3264 | elapsed time per iteration (ms): 13664.5 | learning rate: 9.053E-07 | global batch size: 16 | lm loss: 8.641071E+00 | loss scale: 4096.0 | grad norm: 200697.121 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 205/ 159576 | consumed samples: 3280 | elapsed time per iteration (ms): 13644.0 | learning rate: 9.098E-07 | global batch size: 16 | lm loss: 8.579686E+00 | loss scale: 4096.0 | grad norm: 127286.357 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 206/ 159576 | consumed samples: 3296 | elapsed time per iteration (ms): 14372.0 | learning rate: 9.142E-07 | global batch size: 16 | lm loss: 8.340457E+00 | loss scale: 4096.0 | grad norm: 79901.241 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 207/ 159576 | consumed samples: 3312 | elapsed time per iteration (ms): 13542.0 | learning rate: 9.186E-07 | global batch size: 16 | lm loss: 8.573874E+00 | loss scale: 4096.0 | grad norm: 54182.244 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 208/ 159576 | consumed samples: 3328 | elapsed time per iteration (ms): 13770.4 | learning rate: 9.231E-07 | global batch size: 16 | lm loss: 8.671753E+00 | loss scale: 4096.0 | grad norm: 118528.691 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 209/ 159576 | consumed samples: 3344 | elapsed time per iteration (ms): 13735.7 | learning rate: 9.275E-07 | global batch size: 16 | lm loss: 8.323320E+00 | loss scale: 4096.0 | grad norm: 84996.612 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 210/ 159576 | consumed samples: 3360 | elapsed time per iteration (ms): 13465.7 | learning rate: 9.320E-07 | global batch size: 16 | lm loss: 8.521966E+00 | loss scale: 4096.0 | grad norm: 58490.816 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 211/ 159576 | consumed samples: 3376 | elapsed time per iteration (ms): 14045.3 | learning rate: 9.364E-07 | global batch size: 16 | lm loss: 8.366361E+00 | loss scale: 4096.0 | grad norm: 60420.660 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 212/ 159576 | consumed samples: 3392 | elapsed time per iteration (ms): 13641.0 | learning rate: 9.408E-07 | global batch size: 16 | lm loss: 8.510538E+00 | loss scale: 4096.0 | grad norm: 107003.263 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 213/ 159576 | consumed samples: 3408 | elapsed time per iteration (ms): 13705.1 | learning rate: 9.453E-07 | global batch size: 16 | lm loss: 8.749462E+00 | loss scale: 4096.0 | grad norm: 127548.939 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 214/ 159576 | consumed samples: 3424 | elapsed time per iteration (ms): 13700.1 | learning rate: 9.497E-07 | global batch size: 16 | lm loss: 8.406161E+00 | loss scale: 4096.0 | grad norm: 77133.513 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 215/ 159576 | consumed samples: 3440 | elapsed time per iteration (ms): 14278.2 | learning rate: 9.541E-07 | global batch size: 16 | lm loss: 8.418405E+00 | loss scale: 4096.0 | grad norm: 62254.176 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 216/ 159576 | consumed samples: 3456 | elapsed time per iteration (ms): 13592.8 | learning rate: 9.586E-07 | global batch size: 16 | lm loss: 8.472538E+00 | loss scale: 4096.0 | grad norm: 50530.895 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 217/ 159576 | consumed samples: 3472 | elapsed time per iteration (ms): 13518.7 | learning rate: 9.630E-07 | global batch size: 16 | lm loss: 8.448650E+00 | loss scale: 4096.0 | grad norm: 80646.746 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 218/ 159576 | consumed samples: 3488 | elapsed time per iteration (ms): 13661.2 | learning rate: 9.675E-07 | global batch size: 16 | lm loss: 7.734177E+00 | loss scale: 4096.0 | grad norm: 149486.567 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 219/ 159576 | consumed samples: 3504 | elapsed time per iteration (ms): 14068.7 | learning rate: 9.719E-07 | global batch size: 16 | lm loss: 8.294590E+00 | loss scale: 4096.0 | grad norm: 56571.951 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 220/ 159576 | consumed samples: 3520 | elapsed time per iteration (ms): 13630.3 | learning rate: 9.763E-07 | global batch size: 16 | lm loss: 8.257124E+00 | loss scale: 4096.0 | grad norm: 62046.509 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 221/ 159576 | consumed samples: 3536 | elapsed time per iteration (ms): 13703.1 | learning rate: 9.808E-07 | global batch size: 16 | lm loss: 8.288898E+00 | loss scale: 4096.0 | grad norm: 59852.189 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 222/ 159576 | consumed samples: 3552 | elapsed time per iteration (ms): 13772.5 | learning rate: 9.852E-07 | global batch size: 16 | lm loss: 8.155066E+00 | loss scale: 4096.0 | grad norm: 58014.079 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 223/ 159576 | consumed samples: 3568 | elapsed time per iteration (ms): 13771.9 | learning rate: 9.896E-07 | global batch size: 16 | lm loss: 8.263331E+00 | loss scale: 4096.0 | grad norm: 63268.461 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 224/ 159576 | consumed samples: 3584 | elapsed time per iteration (ms): 14010.9 | learning rate: 9.941E-07 | global batch size: 16 | lm loss: 8.163802E+00 | loss scale: 4096.0 | grad norm: 57272.250 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 225/ 159576 | consumed samples: 3600 | elapsed time per iteration (ms): 13593.2 | learning rate: 9.985E-07 | global batch size: 16 | lm loss: 8.163125E+00 | loss scale: 4096.0 | grad norm: 42586.571 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 226/ 159576 | consumed samples: 3616 | elapsed time per iteration (ms): 13655.1 | learning rate: 1.003E-06 | global batch size: 16 | lm loss: 8.360060E+00 | loss scale: 4096.0 | grad norm: 122218.171 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 227/ 159576 | consumed samples: 3632 | elapsed time per iteration (ms): 13648.6 | learning rate: 1.007E-06 | global batch size: 16 | lm loss: 8.255043E+00 | loss scale: 4096.0 | grad norm: 85521.599 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 228/ 159576 | consumed samples: 3648 | elapsed time per iteration (ms): 14030.4 | learning rate: 1.012E-06 | global batch size: 16 | lm loss: 8.261985E+00 | loss scale: 4096.0 | grad norm: 67005.701 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 229/ 159576 | consumed samples: 3664 | elapsed time per iteration (ms): 13712.9 | learning rate: 1.016E-06 | global batch size: 16 | lm loss: 8.186491E+00 | loss scale: 4096.0 | grad norm: 56484.916 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 230/ 159576 | consumed samples: 3680 | elapsed time per iteration (ms): 13908.9 | learning rate: 1.021E-06 | global batch size: 16 | lm loss: 8.405298E+00 | loss scale: 4096.0 | grad norm: 76846.855 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 231/ 159576 | consumed samples: 3696 | elapsed time per iteration (ms): 13436.7 | learning rate: 1.025E-06 | global batch size: 16 | lm loss: 8.396565E+00 | loss scale: 4096.0 | grad norm: 65903.685 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 232/ 159576 | consumed samples: 3712 | elapsed time per iteration (ms): 13847.3 | learning rate: 1.030E-06 | global batch size: 16 | lm loss: 8.280029E+00 | loss scale: 4096.0 | grad norm: 49376.518 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 233/ 159576 | consumed samples: 3728 | elapsed time per iteration (ms): 13817.4 | learning rate: 1.034E-06 | global batch size: 16 | lm loss: 8.356775E+00 | loss scale: 4096.0 | grad norm: 59866.023 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 234/ 159576 | consumed samples: 3744 | elapsed time per iteration (ms): 13586.3 | learning rate: 1.038E-06 | global batch size: 16 | lm loss: 8.429869E+00 | loss scale: 4096.0 | grad norm: 177436.133 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 235/ 159576 | consumed samples: 3760 | elapsed time per iteration (ms): 13599.7 | learning rate: 1.043E-06 | global batch size: 16 | lm loss: 8.434436E+00 | loss scale: 4096.0 | grad norm: 135413.910 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 236/ 159576 | consumed samples: 3776 | elapsed time per iteration (ms): 13650.1 | learning rate: 1.047E-06 | global batch size: 16 | lm loss: 8.271558E+00 | loss scale: 4096.0 | grad norm: 90861.034 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 237/ 159576 | consumed samples: 3792 | elapsed time per iteration (ms): 14163.4 | learning rate: 1.052E-06 | global batch size: 16 | lm loss: 8.303068E+00 | loss scale: 4096.0 | grad norm: 54299.730 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 238/ 159576 | consumed samples: 3808 | elapsed time per iteration (ms): 13595.2 | learning rate: 1.056E-06 | global batch size: 16 | lm loss: 8.246891E+00 | loss scale: 4096.0 | grad norm: 58398.807 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 239/ 159576 | consumed samples: 3824 | elapsed time per iteration (ms): 13633.1 | learning rate: 1.061E-06 | global batch size: 16 | lm loss: 8.223282E+00 | loss scale: 4096.0 | grad norm: 58574.140 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 240/ 159576 | consumed samples: 3840 | elapsed time per iteration (ms): 13623.5 | learning rate: 1.065E-06 | global batch size: 16 | lm loss: 8.408007E+00 | loss scale: 4096.0 | grad norm: 128668.081 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 241/ 159576 | consumed samples: 3856 | elapsed time per iteration (ms): 14073.7 | learning rate: 1.070E-06 | global batch size: 16 | lm loss: 8.490035E+00 | loss scale: 4096.0 | grad norm: 228763.576 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 242/ 159576 | consumed samples: 3872 | elapsed time per iteration (ms): 13568.7 | learning rate: 1.074E-06 | global batch size: 16 | lm loss: 8.217072E+00 | loss scale: 4096.0 | grad norm: 54955.773 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 243/ 159576 | consumed samples: 3888 | elapsed time per iteration (ms): 13649.7 | learning rate: 1.078E-06 | global batch size: 16 | lm loss: 8.280759E+00 | loss scale: 4096.0 | grad norm: 70277.633 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 244/ 159576 | consumed samples: 3904 | elapsed time per iteration (ms): 13743.3 | learning rate: 1.083E-06 | global batch size: 16 | lm loss: 8.266622E+00 | loss scale: 4096.0 | grad norm: 52088.661 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 245/ 159576 | consumed samples: 3920 | elapsed time per iteration (ms): 13760.9 | learning rate: 1.087E-06 | global batch size: 16 | lm loss: 8.186391E+00 | loss scale: 4096.0 | grad norm: 45303.389 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 246/ 159576 | consumed samples: 3936 | elapsed time per iteration (ms): 13869.6 | learning rate: 1.092E-06 | global batch size: 16 | lm loss: 8.217053E+00 | loss scale: 4096.0 | grad norm: 66052.613 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 247/ 159576 | consumed samples: 3952 | elapsed time per iteration (ms): 13595.0 | learning rate: 1.096E-06 | global batch size: 16 | lm loss: 8.218720E+00 | loss scale: 4096.0 | grad norm: 63154.139 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 248/ 159576 | consumed samples: 3968 | elapsed time per iteration (ms): 13605.0 | learning rate: 1.101E-06 | global batch size: 16 | lm loss: 8.214328E+00 | loss scale: 4096.0 | grad norm: 54827.602 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 249/ 159576 | consumed samples: 3984 | elapsed time per iteration (ms): 13572.6 | learning rate: 1.105E-06 | global batch size: 16 | lm loss: 8.289627E+00 | loss scale: 4096.0 | grad norm: 112939.295 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 250/ 159576 | consumed samples: 4000 | elapsed time per iteration (ms): 13869.8 | learning rate: 1.109E-06 | global batch size: 16 | lm loss: 8.362014E+00 | loss scale: 4096.0 | grad norm: 56746.466 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 251/ 159576 | consumed samples: 4016 | elapsed time per iteration (ms): 13620.5 | learning rate: 1.114E-06 | global batch size: 16 | lm loss: 8.189938E+00 | loss scale: 4096.0 | grad norm: 56152.282 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 252/ 159576 | consumed samples: 4032 | elapsed time per iteration (ms): 13708.2 | learning rate: 1.118E-06 | global batch size: 16 | lm loss: 8.356908E+00 | loss scale: 4096.0 | grad norm: 78498.467 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 253/ 159576 | consumed samples: 4048 | elapsed time per iteration (ms): 13478.4 | learning rate: 1.123E-06 | global batch size: 16 | lm loss: 8.047684E+00 | loss scale: 4096.0 | grad norm: 66252.882 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 254/ 159576 | consumed samples: 4064 | elapsed time per iteration (ms): 14231.8 | learning rate: 1.127E-06 | global batch size: 16 | lm loss: 8.279363E+00 | loss scale: 4096.0 | grad norm: 85125.935 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 255/ 159576 | consumed samples: 4080 | elapsed time per iteration (ms): 13522.4 | learning rate: 1.132E-06 | global batch size: 16 | lm loss: 8.159877E+00 | loss scale: 4096.0 | grad norm: 48952.267 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 256/ 159576 | consumed samples: 4096 | elapsed time per iteration (ms): 13553.5 | learning rate: 1.136E-06 | global batch size: 16 | lm loss: 8.154376E+00 | loss scale: 4096.0 | grad norm: 41715.920 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 257/ 159576 | consumed samples: 4112 | elapsed time per iteration (ms): 13537.5 | learning rate: 1.141E-06 | global batch size: 16 | lm loss: 8.247561E+00 | loss scale: 4096.0 | grad norm: 57864.708 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 258/ 159576 | consumed samples: 4128 | elapsed time per iteration (ms): 13659.5 | learning rate: 1.145E-06 | global batch size: 16 | lm loss: 8.167631E+00 | loss scale: 4096.0 | grad norm: 45439.745 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 259/ 159576 | consumed samples: 4144 | elapsed time per iteration (ms): 14023.4 | learning rate: 1.149E-06 | global batch size: 16 | lm loss: 8.081510E+00 | loss scale: 4096.0 | grad norm: 54108.939 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 260/ 159576 | consumed samples: 4160 | elapsed time per iteration (ms): 13447.5 | learning rate: 1.154E-06 | global batch size: 16 | lm loss: 8.074065E+00 | loss scale: 4096.0 | grad norm: 45799.989 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 261/ 159576 | consumed samples: 4176 | elapsed time per iteration (ms): 13604.0 | learning rate: 1.158E-06 | global batch size: 16 | lm loss: 8.134088E+00 | loss scale: 4096.0 | grad norm: 34426.421 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 262/ 159576 | consumed samples: 4192 | elapsed time per iteration (ms): 13632.5 | learning rate: 1.163E-06 | global batch size: 16 | lm loss: 8.331153E+00 | loss scale: 4096.0 | grad norm: 241742.321 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 263/ 159576 | consumed samples: 4208 | elapsed time per iteration (ms): 14049.0 | learning rate: 1.167E-06 | global batch size: 16 | lm loss: 8.300336E+00 | loss scale: 4096.0 | grad norm: 89382.639 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 264/ 159576 | consumed samples: 4224 | elapsed time per iteration (ms): 13554.0 | learning rate: 1.172E-06 | global batch size: 16 | lm loss: 8.285131E+00 | loss scale: 4096.0 | grad norm: 56471.162 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 265/ 159576 | consumed samples: 4240 | elapsed time per iteration (ms): 13594.4 | learning rate: 1.176E-06 | global batch size: 16 | lm loss: 8.247953E+00 | loss scale: 4096.0 | grad norm: 59934.542 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 266/ 159576 | consumed samples: 4256 | elapsed time per iteration (ms): 13722.5 | learning rate: 1.180E-06 | global batch size: 16 | lm loss: 8.086367E+00 | loss scale: 4096.0 | grad norm: 49794.894 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 267/ 159576 | consumed samples: 4272 | elapsed time per iteration (ms): 13925.6 | learning rate: 1.185E-06 | global batch size: 16 | lm loss: 8.364625E+00 | loss scale: 4096.0 | grad norm: 198667.364 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 268/ 159576 | consumed samples: 4288 | elapsed time per iteration (ms): 13685.9 | learning rate: 1.189E-06 | global batch size: 16 | lm loss: 8.378025E+00 | loss scale: 4096.0 | grad norm: 206726.678 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 269/ 159576 | consumed samples: 4304 | elapsed time per iteration (ms): 13784.2 | learning rate: 1.194E-06 | global batch size: 16 | lm loss: 8.309950E+00 | loss scale: 4096.0 | grad norm: 102692.516 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 270/ 159576 | consumed samples: 4320 | elapsed time per iteration (ms): 13426.6 | learning rate: 1.198E-06 | global batch size: 16 | lm loss: 8.437682E+00 | loss scale: 4096.0 | grad norm: 53779.480 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 271/ 159576 | consumed samples: 4336 | elapsed time per iteration (ms): 13590.5 | learning rate: 1.203E-06 | global batch size: 16 | lm loss: 8.180303E+00 | loss scale: 4096.0 | grad norm: 41837.204 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 272/ 159576 | consumed samples: 4352 | elapsed time per iteration (ms): 13918.1 | learning rate: 1.207E-06 | global batch size: 16 | lm loss: 8.269817E+00 | loss scale: 4096.0 | grad norm: 60250.869 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 273/ 159576 | consumed samples: 4368 | elapsed time per iteration (ms): 13764.9 | learning rate: 1.212E-06 | global batch size: 16 | lm loss: 8.196259E+00 | loss scale: 4096.0 | grad norm: 51310.508 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 274/ 159576 | consumed samples: 4384 | elapsed time per iteration (ms): 13543.7 | learning rate: 1.216E-06 | global batch size: 16 | lm loss: 8.111527E+00 | loss scale: 4096.0 | grad norm: 62869.218 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 275/ 159576 | consumed samples: 4400 | elapsed time per iteration (ms): 13741.6 | learning rate: 1.220E-06 | global batch size: 16 | lm loss: 8.196915E+00 | loss scale: 4096.0 | grad norm: 56382.422 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 276/ 159576 | consumed samples: 4416 | elapsed time per iteration (ms): 14418.6 | learning rate: 1.225E-06 | global batch size: 16 | lm loss: 8.163618E+00 | loss scale: 4096.0 | grad norm: 59897.745 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 277/ 159576 | consumed samples: 4432 | elapsed time per iteration (ms): 13488.6 | learning rate: 1.229E-06 | global batch size: 16 | lm loss: 8.232466E+00 | loss scale: 4096.0 | grad norm: 106883.652 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 278/ 159576 | consumed samples: 4448 | elapsed time per iteration (ms): 13680.7 | learning rate: 1.234E-06 | global batch size: 16 | lm loss: 8.285415E+00 | loss scale: 4096.0 | grad norm: 52155.013 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 279/ 159576 | consumed samples: 4464 | elapsed time per iteration (ms): 13663.3 | learning rate: 1.238E-06 | global batch size: 16 | lm loss: 8.221471E+00 | loss scale: 4096.0 | grad norm: 43151.453 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 280/ 159576 | consumed samples: 4480 | elapsed time per iteration (ms): 13783.3 | learning rate: 1.243E-06 | global batch size: 16 | lm loss: 7.827011E+00 | loss scale: 4096.0 | grad norm: 60081.852 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 281/ 159576 | consumed samples: 4496 | elapsed time per iteration (ms): 13993.1 | learning rate: 1.247E-06 | global batch size: 16 | lm loss: 8.016405E+00 | loss scale: 4096.0 | grad norm: 60969.434 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 282/ 159576 | consumed samples: 4512 | elapsed time per iteration (ms): 13747.2 | learning rate: 1.251E-06 | global batch size: 16 | lm loss: 8.205744E+00 | loss scale: 4096.0 | grad norm: 64657.162 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 283/ 159576 | consumed samples: 4528 | elapsed time per iteration (ms): 13732.1 | learning rate: 1.256E-06 | global batch size: 16 | lm loss: 8.225381E+00 | loss scale: 4096.0 | grad norm: 46007.720 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 284/ 159576 | consumed samples: 4544 | elapsed time per iteration (ms): 13701.8 | learning rate: 1.260E-06 | global batch size: 16 | lm loss: 8.069484E+00 | loss scale: 4096.0 | grad norm: 50539.571 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 285/ 159576 | consumed samples: 4560 | elapsed time per iteration (ms): 13774.1 | learning rate: 1.265E-06 | global batch size: 16 | lm loss: 8.313256E+00 | loss scale: 4096.0 | grad norm: 75301.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 286/ 159576 | consumed samples: 4576 | elapsed time per iteration (ms): 13700.1 | learning rate: 1.269E-06 | global batch size: 16 | lm loss: 8.296308E+00 | loss scale: 4096.0 | grad norm: 109402.142 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 287/ 159576 | consumed samples: 4592 | elapsed time per iteration (ms): 13678.1 | learning rate: 1.274E-06 | global batch size: 16 | lm loss: 8.245502E+00 | loss scale: 4096.0 | grad norm: 53639.635 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 288/ 159576 | consumed samples: 4608 | elapsed time per iteration (ms): 13698.6 | learning rate: 1.278E-06 | global batch size: 16 | lm loss: 8.137961E+00 | loss scale: 4096.0 | grad norm: 42750.465 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 289/ 159576 | consumed samples: 4624 | elapsed time per iteration (ms): 14172.7 | learning rate: 1.283E-06 | global batch size: 16 | lm loss: 8.187901E+00 | loss scale: 4096.0 | grad norm: 108265.490 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 290/ 159576 | consumed samples: 4640 | elapsed time per iteration (ms): 13663.7 | learning rate: 1.287E-06 | global batch size: 16 | lm loss: 8.092007E+00 | loss scale: 4096.0 | grad norm: 61613.623 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 291/ 159576 | consumed samples: 4656 | elapsed time per iteration (ms): 13802.2 | learning rate: 1.291E-06 | global batch size: 16 | lm loss: 8.140871E+00 | loss scale: 4096.0 | grad norm: 73138.188 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 292/ 159576 | consumed samples: 4672 | elapsed time per iteration (ms): 13588.8 | learning rate: 1.296E-06 | global batch size: 16 | lm loss: 8.096482E+00 | loss scale: 4096.0 | grad norm: 56947.365 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 293/ 159576 | consumed samples: 4688 | elapsed time per iteration (ms): 13692.3 | learning rate: 1.300E-06 | global batch size: 16 | lm loss: 8.261303E+00 | loss scale: 4096.0 | grad norm: 50306.115 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 294/ 159576 | consumed samples: 4704 | elapsed time per iteration (ms): 13953.1 | learning rate: 1.305E-06 | global batch size: 16 | lm loss: 8.088846E+00 | loss scale: 4096.0 | grad norm: 70651.882 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 295/ 159576 | consumed samples: 4720 | elapsed time per iteration (ms): 13681.7 | learning rate: 1.309E-06 | global batch size: 16 | lm loss: 8.216883E+00 | loss scale: 4096.0 | grad norm: 109748.850 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 296/ 159576 | consumed samples: 4736 | elapsed time per iteration (ms): 13680.1 | learning rate: 1.314E-06 | global batch size: 16 | lm loss: 8.011025E+00 | loss scale: 4096.0 | grad norm: 57863.308 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 297/ 159576 | consumed samples: 4752 | elapsed time per iteration (ms): 13766.7 | learning rate: 1.318E-06 | global batch size: 16 | lm loss: 8.023094E+00 | loss scale: 4096.0 | grad norm: 39732.348 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 298/ 159576 | consumed samples: 4768 | elapsed time per iteration (ms): 14056.0 | learning rate: 1.322E-06 | global batch size: 16 | lm loss: 8.085699E+00 | loss scale: 4096.0 | grad norm: 93534.410 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 299/ 159576 | consumed samples: 4784 | elapsed time per iteration (ms): 13507.1 | learning rate: 1.327E-06 | global batch size: 16 | lm loss: 8.410425E+00 | loss scale: 4096.0 | grad norm: 42550.581 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 300/ 159576 | consumed samples: 4800 | elapsed time per iteration (ms): 13670.9 | learning rate: 1.331E-06 | global batch size: 16 | lm loss: 8.125405E+00 | loss scale: 4096.0 | grad norm: 37244.445 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 301/ 159576 | consumed samples: 4816 | elapsed time per iteration (ms): 13643.0 | learning rate: 1.336E-06 | global batch size: 16 | lm loss: 7.945562E+00 | loss scale: 4096.0 | grad norm: 37921.680 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 302/ 159576 | consumed samples: 4832 | elapsed time per iteration (ms): 14097.2 | learning rate: 1.340E-06 | global batch size: 16 | lm loss: 8.073545E+00 | loss scale: 4096.0 | grad norm: 80879.552 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 303/ 159576 | consumed samples: 4848 | elapsed time per iteration (ms): 13625.2 | learning rate: 1.345E-06 | global batch size: 16 | lm loss: 8.224352E+00 | loss scale: 4096.0 | grad norm: 75920.356 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 304/ 159576 | consumed samples: 4864 | elapsed time per iteration (ms): 13709.0 | learning rate: 1.349E-06 | global batch size: 16 | lm loss: 8.025059E+00 | loss scale: 4096.0 | grad norm: 39535.605 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 305/ 159576 | consumed samples: 4880 | elapsed time per iteration (ms): 13741.5 | learning rate: 1.354E-06 | global batch size: 16 | lm loss: 8.094482E+00 | loss scale: 4096.0 | grad norm: 40630.922 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 306/ 159576 | consumed samples: 4896 | elapsed time per iteration (ms): 13523.7 | learning rate: 1.358E-06 | global batch size: 16 | lm loss: 8.135887E+00 | loss scale: 4096.0 | grad norm: 80825.550 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 307/ 159576 | consumed samples: 4912 | elapsed time per iteration (ms): 14093.4 | learning rate: 1.362E-06 | global batch size: 16 | lm loss: 8.292034E+00 | loss scale: 4096.0 | grad norm: 86171.888 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 308/ 159576 | consumed samples: 4928 | elapsed time per iteration (ms): 13647.9 | learning rate: 1.367E-06 | global batch size: 16 | lm loss: 8.204563E+00 | loss scale: 4096.0 | grad norm: 46698.010 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 309/ 159576 | consumed samples: 4944 | elapsed time per iteration (ms): 13637.2 | learning rate: 1.371E-06 | global batch size: 16 | lm loss: 8.033182E+00 | loss scale: 4096.0 | grad norm: 42089.185 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 310/ 159576 | consumed samples: 4960 | elapsed time per iteration (ms): 13700.6 | learning rate: 1.376E-06 | global batch size: 16 | lm loss: 8.048797E+00 | loss scale: 4096.0 | grad norm: 56022.805 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 311/ 159576 | consumed samples: 4976 | elapsed time per iteration (ms): 14085.5 | learning rate: 1.380E-06 | global batch size: 16 | lm loss: 7.623003E+00 | loss scale: 4096.0 | grad norm: 72171.220 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 312/ 159576 | consumed samples: 4992 | elapsed time per iteration (ms): 13830.9 | learning rate: 1.385E-06 | global batch size: 16 | lm loss: 8.082812E+00 | loss scale: 4096.0 | grad norm: 39681.453 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 313/ 159576 | consumed samples: 5008 | elapsed time per iteration (ms): 13533.9 | learning rate: 1.389E-06 | global batch size: 16 | lm loss: 8.116117E+00 | loss scale: 4096.0 | grad norm: 33726.889 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 314/ 159576 | consumed samples: 5024 | elapsed time per iteration (ms): 13637.3 | learning rate: 1.393E-06 | global batch size: 16 | lm loss: 8.210217E+00 | loss scale: 4096.0 | grad norm: 89402.073 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 315/ 159576 | consumed samples: 5040 | elapsed time per iteration (ms): 14136.6 | learning rate: 1.398E-06 | global batch size: 16 | lm loss: 7.798199E+00 | loss scale: 4096.0 | grad norm: 83566.570 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 316/ 159576 | consumed samples: 5056 | elapsed time per iteration (ms): 13651.3 | learning rate: 1.402E-06 | global batch size: 16 | lm loss: 8.066372E+00 | loss scale: 4096.0 | grad norm: 38768.697 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 317/ 159576 | consumed samples: 5072 | elapsed time per iteration (ms): 13641.7 | learning rate: 1.407E-06 | global batch size: 16 | lm loss: 7.876265E+00 | loss scale: 4096.0 | grad norm: 36174.406 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 318/ 159576 | consumed samples: 5088 | elapsed time per iteration (ms): 13653.8 | learning rate: 1.411E-06 | global batch size: 16 | lm loss: 7.979768E+00 | loss scale: 4096.0 | grad norm: 66651.391 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 319/ 159576 | consumed samples: 5104 | elapsed time per iteration (ms): 13755.9 | learning rate: 1.416E-06 | global batch size: 16 | lm loss: 8.094232E+00 | loss scale: 4096.0 | grad norm: 79088.558 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 320/ 159576 | consumed samples: 5120 | elapsed time per iteration (ms): 13900.8 | learning rate: 1.420E-06 | global batch size: 16 | lm loss: 8.113304E+00 | loss scale: 4096.0 | grad norm: 52331.401 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 321/ 159576 | consumed samples: 5136 | elapsed time per iteration (ms): 13649.9 | learning rate: 1.425E-06 | global batch size: 16 | lm loss: 8.128990E+00 | loss scale: 4096.0 | grad norm: 46927.679 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 322/ 159576 | consumed samples: 5152 | elapsed time per iteration (ms): 13693.6 | learning rate: 1.429E-06 | global batch size: 16 | lm loss: 8.486778E+00 | loss scale: 4096.0 | grad norm: 89462.672 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 323/ 159576 | consumed samples: 5168 | elapsed time per iteration (ms): 13699.8 | learning rate: 1.433E-06 | global batch size: 16 | lm loss: 8.051263E+00 | loss scale: 4096.0 | grad norm: 42680.523 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 324/ 159576 | consumed samples: 5184 | elapsed time per iteration (ms): 14041.8 | learning rate: 1.438E-06 | global batch size: 16 | lm loss: 8.181097E+00 | loss scale: 4096.0 | grad norm: 43801.136 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 325/ 159576 | consumed samples: 5200 | elapsed time per iteration (ms): 13711.0 | learning rate: 1.442E-06 | global batch size: 16 | lm loss: 8.171723E+00 | loss scale: 4096.0 | grad norm: 47748.407 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 326/ 159576 | consumed samples: 5216 | elapsed time per iteration (ms): 13743.3 | learning rate: 1.447E-06 | global batch size: 16 | lm loss: 8.035454E+00 | loss scale: 4096.0 | grad norm: 58353.227 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 327/ 159576 | consumed samples: 5232 | elapsed time per iteration (ms): 13602.7 | learning rate: 1.451E-06 | global batch size: 16 | lm loss: 8.021453E+00 | loss scale: 4096.0 | grad norm: 44165.609 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 328/ 159576 | consumed samples: 5248 | elapsed time per iteration (ms): 13748.9 | learning rate: 1.456E-06 | global batch size: 16 | lm loss: 8.051726E+00 | loss scale: 4096.0 | grad norm: 35138.807 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 329/ 159576 | consumed samples: 5264 | elapsed time per iteration (ms): 13961.7 | learning rate: 1.460E-06 | global batch size: 16 | lm loss: 7.960547E+00 | loss scale: 4096.0 | grad norm: 41197.060 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 330/ 159576 | consumed samples: 5280 | elapsed time per iteration (ms): 13633.4 | learning rate: 1.464E-06 | global batch size: 16 | lm loss: 8.084079E+00 | loss scale: 4096.0 | grad norm: 43199.182 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 331/ 159576 | consumed samples: 5296 | elapsed time per iteration (ms): 13678.9 | learning rate: 1.469E-06 | global batch size: 16 | lm loss: 8.243130E+00 | loss scale: 4096.0 | grad norm: 39935.584 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 332/ 159576 | consumed samples: 5312 | elapsed time per iteration (ms): 13653.3 | learning rate: 1.473E-06 | global batch size: 16 | lm loss: 8.148146E+00 | loss scale: 4096.0 | grad norm: 31710.971 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 333/ 159576 | consumed samples: 5328 | elapsed time per iteration (ms): 13982.9 | learning rate: 1.478E-06 | global batch size: 16 | lm loss: 8.055049E+00 | loss scale: 4096.0 | grad norm: 40555.458 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 334/ 159576 | consumed samples: 5344 | elapsed time per iteration (ms): 13576.5 | learning rate: 1.482E-06 | global batch size: 16 | lm loss: 8.154724E+00 | loss scale: 4096.0 | grad norm: 98189.157 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 335/ 159576 | consumed samples: 5360 | elapsed time per iteration (ms): 13666.3 | learning rate: 1.487E-06 | global batch size: 16 | lm loss: 8.056485E+00 | loss scale: 4096.0 | grad norm: 53277.066 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 336/ 159576 | consumed samples: 5376 | elapsed time per iteration (ms): 13667.7 | learning rate: 1.491E-06 | global batch size: 16 | lm loss: 7.902112E+00 | loss scale: 4096.0 | grad norm: 35520.620 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 337/ 159576 | consumed samples: 5392 | elapsed time per iteration (ms): 14189.1 | learning rate: 1.496E-06 | global batch size: 16 | lm loss: 8.211933E+00 | loss scale: 4096.0 | grad norm: 102636.452 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 338/ 159576 | consumed samples: 5408 | elapsed time per iteration (ms): 13538.3 | learning rate: 1.500E-06 | global batch size: 16 | lm loss: 8.077993E+00 | loss scale: 4096.0 | grad norm: 74161.424 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 339/ 159576 | consumed samples: 5424 | elapsed time per iteration (ms): 13690.1 | learning rate: 1.504E-06 | global batch size: 16 | lm loss: 8.002722E+00 | loss scale: 4096.0 | grad norm: 41178.202 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 340/ 159576 | consumed samples: 5440 | elapsed time per iteration (ms): 13761.4 | learning rate: 1.509E-06 | global batch size: 16 | lm loss: 8.070647E+00 | loss scale: 4096.0 | grad norm: 146660.160 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 341/ 159576 | consumed samples: 5456 | elapsed time per iteration (ms): 13679.6 | learning rate: 1.513E-06 | global batch size: 16 | lm loss: 8.211810E+00 | loss scale: 4096.0 | grad norm: 56011.276 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 342/ 159576 | consumed samples: 5472 | elapsed time per iteration (ms): 13958.7 | learning rate: 1.518E-06 | global batch size: 16 | lm loss: 8.028828E+00 | loss scale: 4096.0 | grad norm: 45507.509 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 343/ 159576 | consumed samples: 5488 | elapsed time per iteration (ms): 13796.1 | learning rate: 1.522E-06 | global batch size: 16 | lm loss: 8.000618E+00 | loss scale: 4096.0 | grad norm: 41366.016 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 344/ 159576 | consumed samples: 5504 | elapsed time per iteration (ms): 13566.5 | learning rate: 1.527E-06 | global batch size: 16 | lm loss: 8.106353E+00 | loss scale: 4096.0 | grad norm: 86487.826 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 345/ 159576 | consumed samples: 5520 | elapsed time per iteration (ms): 13617.7 | learning rate: 1.531E-06 | global batch size: 16 | lm loss: 8.130958E+00 | loss scale: 4096.0 | grad norm: 65559.636 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 346/ 159576 | consumed samples: 5536 | elapsed time per iteration (ms): 14006.3 | learning rate: 1.536E-06 | global batch size: 16 | lm loss: 8.100373E+00 | loss scale: 4096.0 | grad norm: 50918.888 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 347/ 159576 | consumed samples: 5552 | elapsed time per iteration (ms): 13652.0 | learning rate: 1.540E-06 | global batch size: 16 | lm loss: 8.193462E+00 | loss scale: 4096.0 | grad norm: 49482.923 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 348/ 159576 | consumed samples: 5568 | elapsed time per iteration (ms): 13785.4 | learning rate: 1.544E-06 | global batch size: 16 | lm loss: 8.185720E+00 | loss scale: 4096.0 | grad norm: 33616.818 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 349/ 159576 | consumed samples: 5584 | elapsed time per iteration (ms): 13534.7 | learning rate: 1.549E-06 | global batch size: 16 | lm loss: 7.997324E+00 | loss scale: 4096.0 | grad norm: 41224.808 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 350/ 159576 | consumed samples: 5600 | elapsed time per iteration (ms): 14148.0 | learning rate: 1.553E-06 | global batch size: 16 | lm loss: 8.069170E+00 | loss scale: 4096.0 | grad norm: 61139.413 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 351/ 159576 | consumed samples: 5616 | elapsed time per iteration (ms): 13626.0 | learning rate: 1.558E-06 | global batch size: 16 | lm loss: 8.052499E+00 | loss scale: 4096.0 | grad norm: 58965.426 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 352/ 159576 | consumed samples: 5632 | elapsed time per iteration (ms): 13633.5 | learning rate: 1.562E-06 | global batch size: 16 | lm loss: 8.036291E+00 | loss scale: 4096.0 | grad norm: 38820.487 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 353/ 159576 | consumed samples: 5648 | elapsed time per iteration (ms): 13648.6 | learning rate: 1.567E-06 | global batch size: 16 | lm loss: 8.007360E+00 | loss scale: 4096.0 | grad norm: 33342.929 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 354/ 159576 | consumed samples: 5664 | elapsed time per iteration (ms): 13707.0 | learning rate: 1.571E-06 | global batch size: 16 | lm loss: 7.890161E+00 | loss scale: 4096.0 | grad norm: 62589.896 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 355/ 159576 | consumed samples: 5680 | elapsed time per iteration (ms): 14101.4 | learning rate: 1.575E-06 | global batch size: 16 | lm loss: 8.034273E+00 | loss scale: 4096.0 | grad norm: 62100.784 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 356/ 159576 | consumed samples: 5696 | elapsed time per iteration (ms): 13548.4 | learning rate: 1.580E-06 | global batch size: 16 | lm loss: 7.964279E+00 | loss scale: 4096.0 | grad norm: 37283.643 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 357/ 159576 | consumed samples: 5712 | elapsed time per iteration (ms): 13655.3 | learning rate: 1.584E-06 | global batch size: 16 | lm loss: 7.882459E+00 | loss scale: 4096.0 | grad norm: 36278.786 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 358/ 159576 | consumed samples: 5728 | elapsed time per iteration (ms): 13872.1 | learning rate: 1.589E-06 | global batch size: 16 | lm loss: 8.081428E+00 | loss scale: 4096.0 | grad norm: 59624.520 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 359/ 159576 | consumed samples: 5744 | elapsed time per iteration (ms): 13830.3 | learning rate: 1.593E-06 | global batch size: 16 | lm loss: 8.345490E+00 | loss scale: 4096.0 | grad norm: 101818.152 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 360/ 159576 | consumed samples: 5760 | elapsed time per iteration (ms): 13738.3 | learning rate: 1.598E-06 | global batch size: 16 | lm loss: 8.090802E+00 | loss scale: 4096.0 | grad norm: 37735.210 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 361/ 159576 | consumed samples: 5776 | elapsed time per iteration (ms): 13673.7 | learning rate: 1.602E-06 | global batch size: 16 | lm loss: 7.934822E+00 | loss scale: 4096.0 | grad norm: 35051.225 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 362/ 159576 | consumed samples: 5792 | elapsed time per iteration (ms): 13779.0 | learning rate: 1.607E-06 | global batch size: 16 | lm loss: 8.217977E+00 | loss scale: 4096.0 | grad norm: 81671.155 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 363/ 159576 | consumed samples: 5808 | elapsed time per iteration (ms): 14148.6 | learning rate: 1.611E-06 | global batch size: 16 | lm loss: 7.956856E+00 | loss scale: 4096.0 | grad norm: 123728.069 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 364/ 159576 | consumed samples: 5824 | elapsed time per iteration (ms): 13509.6 | learning rate: 1.615E-06 | global batch size:
16 | lm loss: 7.980748E+00 | loss scale: 4096.0 | grad norm: 64323.538 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 365/ 159576 | consumed samples: 5840 | elapsed time per iteration (ms): 13791.1 | learning rate: 1.620E-06 | global batch size: 16 | lm loss: 7.927495E+00 | loss scale: 4096.0 | grad norm: 38595.229 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 366/ 159576 | consumed samples: 5856 | elapsed time per iteration (ms): 13535.8 | learning rate: 1.624E-06 | global batch size: 16 | lm loss: 7.992770E+00 | loss scale: 4096.0 | grad norm: 34786.799 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 367/ 159576 | consumed samples: 5872 | elapsed time per iteration (ms): 13709.6 | learning rate: 1.629E-06 | global batch size: 16 | lm loss: 8.033854E+00 | loss scale: 4096.0 | grad norm: 26681.238 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 368/ 159576 | consumed samples: 5888 | elapsed time per iteration (ms): 13923.8 | learning rate: 1.633E-06 | global batch size: 16 | lm loss: 8.086361E+00 | loss scale: 4096.0 | grad norm: 116063.612 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 369/ 159576 | consumed samples: 5904 | elapsed time per iteration (ms): 13743.2 | learning rate: 1.638E-06 | global batch size: 16 | lm loss: 8.136069E+00 | loss scale: 4096.0 | grad norm: 192843.981 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 370/ 159576 | consumed samples: 5920 | elapsed time per iteration (ms): 13586.5 | learning rate: 1.642E-06 | global batch size: 16 | lm loss: 8.213842E+00 | loss scale: 4096.0 | grad norm: 66749.630 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 371/ 159576 | consumed samples: 5936 | elapsed time per iteration (ms): 13637.5 | learning rate: 1.646E-06 | global batch size: 16 | lm loss: 7.862526E+00 | loss scale: 4096.0 | grad norm: 35628.877 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 372/ 159576 | consumed samples: 5952 | elapsed time per iteration (ms): 14269.3 | learning rate: 1.651E-06 | global batch size: 16 | lm loss: 8.111351E+00 | loss scale: 4096.0 | grad norm: 51284.654 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 373/ 159576 | consumed samples: 5968 | elapsed time per iteration (ms): 13424.8 | learning rate: 1.655E-06 | global batch size: 16 | lm loss: 7.860275E+00 | loss scale: 4096.0 | grad norm: 51885.287 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 374/ 159576 | consumed samples: 5984 | elapsed time per iteration (ms): 13638.9 | learning rate: 1.660E-06 | global batch size: 16 | lm loss: 7.995843E+00 | loss scale: 4096.0 | grad norm: 40982.716 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 375/ 159576 | consumed samples: 6000 | elapsed time per iteration (ms): 13719.8 | learning rate: 1.664E-06 | global batch size: 16 | lm loss: 7.989121E+00 | loss scale: 4096.0 | grad norm: 43694.588 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 376/ 159576 | 
consumed samples: 6016 | elapsed time per iteration (ms): 13718.2 | learning rate: 1.669E-06 | global batch size: 16 | lm loss: 8.054690E+00 | loss scale: 4096.0 | grad norm: 56142.201 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 377/ 159576 | consumed samples: 6032 | elapsed time per iteration (ms): 14087.0 | learning rate: 1.673E-06 | global batch size: 16 | lm loss: 8.145277E+00 | loss scale: 4096.0 | grad norm: 77837.877 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 378/ 159576 | consumed samples: 6048 | elapsed time per iteration (ms): 13621.7 | learning rate: 1.678E-06 | global batch size: 16 | lm loss: 7.879861E+00 | loss scale: 4096.0 | grad norm: 35054.780 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 379/ 159576 | consumed samples: 6064 | elapsed time per iteration (ms): 13676.7 | learning rate: 1.682E-06 | global batch size: 16 | lm loss: 7.996103E+00 | loss scale: 4096.0 | grad norm: 31871.611 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 380/ 159576 | consumed samples: 6080 | elapsed time per iteration (ms): 13756.2 | learning rate: 1.686E-06 | global batch size: 16 | lm loss: 7.788074E+00 | loss scale: 4096.0 | grad norm: 30378.507 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 381/ 159576 | consumed samples: 6096 | elapsed time per iteration (ms): 13731.7 | learning rate: 1.691E-06 | global batch size: 16 | lm loss: 7.998044E+00 | loss scale: 4096.0 | grad norm: 78167.228 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 382/ 159576 | consumed samples: 6112 | elapsed time per iteration (ms): 13696.8 | learning rate: 1.695E-06 | global batch size: 16 | lm loss: 8.001510E+00 | loss scale: 4096.0 | grad norm: 57981.800 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 383/ 159576 | consumed samples: 6128 | elapsed time per iteration (ms): 13688.0 | learning rate: 1.700E-06 | global batch size: 16 | lm loss: 8.043833E+00 | loss scale: 4096.0 | grad norm: 40631.885 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 384/ 159576 | consumed samples: 6144 | elapsed time per iteration (ms): 13680.4 | learning rate: 1.704E-06 | global batch size: 16 | lm loss: 8.029270E+00 | loss scale: 4096.0 | grad norm: 31579.477 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 385/ 159576 | consumed samples: 6160 | elapsed time per iteration (ms): 14057.5 | learning rate: 1.709E-06 | global batch size: 16 | lm loss: 8.156369E+00 | loss scale: 4096.0 | grad norm: 87842.060 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 386/ 159576 | consumed samples: 6176 | elapsed time per iteration (ms): 13765.1 | learning rate: 1.713E-06 | global batch size: 16 | lm loss: 8.024692E+00 | loss scale: 4096.0 | grad norm: 56881.857 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 387/ 159576 | consumed samples: 6192 | elapsed time per iteration (ms): 13768.8 | learning rate: 1.717E-06 | global batch size: 16 | lm loss: 7.997876E+00 | loss scale: 4096.0 | grad norm: 31105.819 | num zeros: 
0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 388/ 159576 | consumed samples: 6208 | elapsed time per iteration (ms): 13433.5 | learning rate: 1.722E-06 | global batch size: 16 | lm loss: 7.985063E+00 | loss scale: 4096.0 | grad norm: 78090.353 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 389/ 159576 | consumed samples: 6224 | elapsed time per iteration (ms): 13675.2 | learning rate: 1.726E-06 | global batch size: 16 | lm loss: 7.926050E+00 | loss scale: 4096.0 | grad norm: 61534.683 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 390/ 159576 | consumed samples: 6240 | elapsed time per iteration (ms): 13989.4 | learning rate: 1.731E-06 | global batch size: 16 | lm loss: 7.938218E+00 | loss scale: 4096.0 | grad norm: 37749.344 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 391/ 159576 | consumed samples: 6256 | elapsed time per iteration (ms): 13663.4 | learning rate: 1.735E-06 | global batch size: 16 | lm loss: 7.835842E+00 | loss scale: 4096.0 | grad norm: 48700.287 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 392/ 159576 | consumed samples: 6272 | elapsed time per iteration (ms): 13682.5 | learning rate: 1.740E-06 | global batch size: 16 | lm loss: 7.976984E+00 | loss scale: 4096.0 | grad norm: 45273.731 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 393/ 159576 | consumed samples: 6288 | elapsed time per iteration (ms): 13680.3 | learning rate: 1.744E-06 | global batch size: 16 | lm loss: 8.063533E+00 | loss scale: 4096.0 | grad norm: 62966.350 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 394/ 159576 | consumed samples: 6304 | elapsed time per iteration (ms): 14158.6 | learning rate: 1.749E-06 | global batch size: 16 | lm loss: 7.962408E+00 | loss scale: 4096.0 | grad norm: 38917.941 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 395/ 159576 | consumed samples: 6320 | elapsed time per iteration (ms): 13412.3 | learning rate: 1.753E-06 | global batch size: 16 | lm loss: 7.930057E+00 | loss scale: 4096.0 | grad norm: 59046.433 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 396/ 159576 | consumed samples: 6336 | elapsed time per iteration (ms): 13631.9 | learning rate: 1.757E-06 | global batch size: 16 | lm loss: 8.137497E+00 | loss scale: 4096.0 | grad norm: 51299.741 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 397/ 159576 | consumed samples: 6352 | elapsed time per iteration (ms): 13706.0 | learning rate: 1.762E-06 | global batch size: 16 | lm loss: 8.020626E+00 | loss scale: 4096.0 | grad norm: 37056.313 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 398/ 159576 | consumed samples: 6368 | elapsed time per iteration (ms): 14158.0 | learning rate: 1.766E-06 | global batch size: 16 | lm loss: 8.114269E+00 | loss scale: 4096.0 | grad norm: 64105.827 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 399/ 159576 | consumed samples: 6384 | elapsed time per iteration (ms): 13628.9 | learning rate: 1.771E-06 
| global batch size: 16 | lm loss: 8.186448E+00 | loss scale: 4096.0 | grad norm: 55633.908 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 400/ 159576 | consumed samples: 6400 | elapsed time per iteration (ms): 13727.5 | learning rate: 1.775E-06 | global batch size: 16 | lm loss: 8.182411E+00 | loss scale: 4096.0 | grad norm: 51312.945 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 401/ 159576 | consumed samples: 6416 | elapsed time per iteration (ms): 13749.7 | learning rate: 1.780E-06 | global batch size: 16 | lm loss: 8.020710E+00 | loss scale: 4096.0 | grad norm: 32983.756 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 402/ 159576 | consumed samples: 6432 | elapsed time per iteration (ms): 13473.4 | learning rate: 1.784E-06 | global batch size: 16 | lm loss: 7.970335E+00 | loss scale: 4096.0 | grad norm: 70699.597 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 403/ 159576 | consumed samples: 6448 | elapsed time per iteration (ms): 13904.7 | learning rate: 1.788E-06 | global batch size: 16 | lm loss: 7.993033E+00 | loss scale: 4096.0 | grad norm: 67107.513 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 404/ 159576 | consumed samples: 6464 | elapsed time per iteration (ms): 13683.9 | learning rate: 1.793E-06 | global batch size: 16 | lm loss: 8.091874E+00 | loss scale: 4096.0 | grad norm: 26716.683 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 405/ 159576 | consumed samples: 6480 | elapsed time per iteration (ms): 13642.3 | learning rate: 1.797E-06 | global batch size: 16 | lm loss: 8.088682E+00 | loss scale: 4096.0 | grad norm: 74507.909 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 406/ 159576 | consumed samples: 6496 | elapsed time per iteration (ms): 13688.7 | learning rate: 1.802E-06 | global batch size: 16 | lm loss: 8.134460E+00 | loss scale: 4096.0 | grad norm: 64155.050 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 407/ 159576 | consumed samples: 6512 | elapsed time per iteration (ms): 14175.7 | learning rate: 1.806E-06 | global batch size: 16 | lm loss: 8.105555E+00 | loss scale: 4096.0 | grad norm: 39464.479 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 408/ 159576 | consumed samples: 6528 | elapsed time per iteration (ms): 13703.7 | learning rate: 1.811E-06 | global batch size: 16 | lm loss: 7.988219E+00 | loss scale: 4096.0 | grad norm: 39779.639 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 409/ 159576 | consumed samples: 6544 | elapsed time per iteration (ms): 13499.5 | learning rate: 1.815E-06 | global batch size: 16 | lm loss: 7.931721E+00 | loss scale: 4096.0 | grad norm: 46421.169 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 410/ 159576 | consumed samples: 6560 | elapsed time per iteration (ms): 13608.5 | learning rate: 1.820E-06 | global batch size: 16 | lm loss: 7.944845E+00 | loss scale: 4096.0 | grad norm: 28537.165 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 
411/ 159576 | consumed samples: 6576 | elapsed time per iteration (ms): 14088.6 | learning rate: 1.824E-06 | global batch size: 16 | lm loss: 7.955441E+00 | loss scale: 4096.0 | grad norm: 68818.472 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 412/ 159576 | consumed samples: 6592 | elapsed time per iteration (ms): 13613.5 | learning rate: 1.828E-06 | global batch size: 16 | lm loss: 8.293702E+00 | loss scale: 4096.0 | grad norm: 73315.445 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 413/ 159576 | consumed samples: 6608 | elapsed time per iteration (ms): 13670.1 | learning rate: 1.833E-06 | global batch size: 16 | lm loss: 7.982622E+00 | loss scale: 4096.0 | grad norm: 40882.033 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 414/ 159576 | consumed samples: 6624 | elapsed time per iteration (ms): 13753.2 | learning rate: 1.837E-06 | global batch size: 16 | lm loss: 7.981937E+00 | loss scale: 4096.0 | grad norm: 34929.207 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 415/ 159576 | consumed samples: 6640 | elapsed time per iteration (ms): 13749.7 | learning rate: 1.842E-06 | global batch size: 16 | lm loss: 8.060836E+00 | loss scale: 4096.0 | grad norm: 47572.261 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 416/ 159576 | consumed samples: 6656 | elapsed time per iteration (ms): 13758.6 | learning rate: 1.846E-06 | global batch size: 16 | lm loss: 8.002974E+00 | loss scale: 4096.0 | grad norm: 37872.224 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 417/ 159576 | consumed samples: 6672 | elapsed time per iteration (ms): 13599.2 | learning rate: 1.851E-06 | global batch size: 16 | lm loss: 7.972270E+00 | loss scale: 4096.0 | grad norm: 44233.921 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 418/ 159576 | consumed samples: 6688 | elapsed time per iteration (ms): 13571.0 | learning rate: 1.855E-06 | global batch size: 16 | lm loss: 8.249717E+00 | loss scale: 4096.0 | grad norm: 60770.929 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 419/ 159576 | consumed samples: 6704 | elapsed time per iteration (ms): 13598.5 | learning rate: 1.859E-06 | global batch size: 16 | lm loss: 7.861569E+00 | loss scale: 4096.0 | grad norm: 31277.711 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 420/ 159576 | consumed samples: 6720 | elapsed time per iteration (ms): 14077.1 | learning rate: 1.864E-06 | global batch size: 16 | lm loss: 7.965170E+00 | loss scale: 4096.0 | grad norm: 72793.609 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 421/ 159576 | consumed samples: 6736 | elapsed time per iteration (ms): 13383.0 | learning rate: 1.868E-06 | global batch size: 16 | lm loss: 7.907632E+00 | loss scale: 4096.0 | grad norm: 60405.796 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 422/ 159576 | consumed samples: 6752 | elapsed time per iteration (ms): 13739.1 | learning rate: 1.873E-06 | global batch size: 16 | lm loss: 8.041030E+00 | loss scale: 4096.0 | grad norm: 49156.237 
| num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 423/ 159576 | consumed samples: 6768 | elapsed time per iteration (ms): 13364.3 | learning rate: 1.877E-06 | global batch size: 16 | lm loss: 7.965994E+00 | loss scale: 4096.0 | grad norm: 37382.408 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 424/ 159576 | consumed samples: 6784 | elapsed time per iteration (ms): 13509.2 | learning rate: 1.882E-06 | global batch size: 16 | lm loss: 7.979969E+00 | loss scale: 4096.0 | grad norm: 30214.011 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 425/ 159576 | consumed samples: 6800 | elapsed time per iteration (ms): 13784.5 | learning rate: 1.886E-06 | global batch size: 16 | lm loss: 7.877289E+00 | loss scale: 4096.0 | grad norm: 31571.817 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 426/ 159576 | consumed samples: 6816 | elapsed time per iteration (ms): 13491.5 | learning rate: 1.891E-06 | global batch size: 16 | lm loss: 8.049381E+00 | loss scale: 4096.0 | grad norm: 61185.189 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 427/ 159576 | consumed samples: 6832 | elapsed time per iteration (ms): 13530.6 | learning rate: 1.895E-06 | global batch size: 16 | lm loss: 7.963693E+00 | loss scale: 4096.0 | grad norm: 45639.191 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 428/ 159576 | consumed samples: 6848 | elapsed time per iteration (ms): 13594.4 | learning rate: 1.899E-06 | global batch size: 16 | lm loss: 7.874112E+00 | loss scale: 4096.0 | grad norm: 34163.218 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 429/ 159576 | consumed samples: 6864 | elapsed time per iteration (ms): 14157.2 | learning rate: 1.904E-06 | global batch size: 16 | lm loss: 8.141135E+00 | loss scale: 4096.0 | grad norm: 43864.273 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 430/ 159576 | consumed samples: 6880 | elapsed time per iteration (ms): 13539.3 | learning rate: 1.908E-06 | global batch size: 16 | lm loss: 7.883408E+00 | loss scale: 4096.0 | grad norm: 38957.139 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 431/ 159576 | consumed samples: 6896 | elapsed time per iteration (ms): 13542.5 | learning rate: 1.913E-06 | global batch size: 16 | lm loss: 7.858832E+00 | loss scale: 4096.0 | grad norm: 26292.591 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 432/ 159576 | consumed samples: 6912 | elapsed time per iteration (ms): 13843.5 | learning rate: 1.917E-06 | global batch size: 16 | lm loss: 7.901114E+00 | loss scale: 4096.0 | grad norm: 65782.734 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 433/ 159576 | consumed samples: 6928 | elapsed time per iteration (ms): 13570.9 | learning rate: 1.922E-06 | global batch size: 16 | lm loss: 8.025250E+00 | loss scale: 4096.0 | grad norm: 99671.911 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 434/ 159576 | consumed samples: 6944 | elapsed time per iteration (ms): 13645.1 | learning 
rate: 1.926E-06 | global batch size: 16 | lm loss: 7.512252E+00 | loss scale: 4096.0 | grad norm: 55130.336 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 435/ 159576 | consumed samples: 6960 | elapsed time per iteration (ms): 13607.8 | learning rate: 1.930E-06 | global batch size: 16 | lm loss: 7.858408E+00 | loss scale: 4096.0 | grad norm: 33670.129 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 436/ 159576 | consumed samples: 6976 | elapsed time per iteration (ms): 13679.8 | learning rate: 1.935E-06 | global batch size: 16 | lm loss: 7.844939E+00 | loss scale: 4096.0 | grad norm: 39814.378 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 437/ 159576 | consumed samples: 6992 | elapsed time per iteration (ms): 13689.9 | learning rate: 1.939E-06 | global batch size: 16 | lm loss: 8.013271E+00 | loss scale: 4096.0 | grad norm: 62672.031 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 438/ 159576 | consumed samples: 7008 | elapsed time per iteration (ms): 13781.3 | learning rate: 1.944E-06 | global batch size: 16 | lm loss: 7.903483E+00 | loss scale: 4096.0 | grad norm: 41414.951 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 439/ 159576 | consumed samples: 7024 | elapsed time per iteration (ms): 13527.3 | learning rate: 1.948E-06 | global batch size: 16 | lm loss: 8.131282E+00 | loss scale: 4096.0 | grad norm: 32283.331 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 440/ 159576 | consumed samples: 7040 | elapsed time per iteration (ms): 13501.3 | learning rate: 1.953E-06 | global batch size: 16 | lm loss: 7.865626E+00 | loss scale: 4096.0 | grad norm: 35041.386 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 441/ 159576 | consumed samples: 7056 | elapsed time per iteration (ms): 13519.5 | learning rate: 1.957E-06 | global batch size: 16 | lm loss: 7.741554E+00 | loss scale: 4096.0 | grad norm: 36249.919 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 442/ 159576 | consumed samples: 7072 | elapsed time per iteration (ms): 14043.2 | learning rate: 1.962E-06 | global batch size: 16 | lm loss: 7.954229E+00 | loss scale: 4096.0 | grad norm: 73161.393 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 443/ 159576 | consumed samples: 7088 | elapsed time per iteration (ms): 13566.1 | learning rate: 1.966E-06 | global batch size: 16 | lm loss: 7.943119E+00 | loss scale: 4096.0 | grad norm: 46167.002 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 444/ 159576 | consumed samples: 7104 | elapsed time per iteration (ms): 13755.3 | learning rate: 1.970E-06 | global batch size: 16 | lm loss: 7.861948E+00 | loss scale: 4096.0 | grad norm: 37826.022 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 445/ 159576 | consumed samples: 7120 | elapsed time per iteration (ms): 13434.4 | learning rate: 1.975E-06 | global batch size: 16 | lm loss: 7.838496E+00 | loss scale: 4096.0 | grad norm: 56817.525 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time 
(ms) - iteration 446/ 159576 | consumed samples: 7136 | elapsed time per iteration (ms): 13607.2 | learning rate: 1.979E-06 | global batch size: 16 | lm loss: 7.932389E+00 | loss scale: 4096.0 | grad norm: 38213.438 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 447/ 159576 | consumed samples: 7152 | elapsed time per iteration (ms): 14012.8 | learning rate: 1.984E-06 | global batch size: 16 | lm loss: 7.808257E+00 | loss scale: 4096.0 | grad norm: 37539.445 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 448/ 159576 | consumed samples: 7168 | elapsed time per iteration (ms): 13428.4 | learning rate: 1.988E-06 | global batch size: 16 | lm loss: 7.818873E+00 | loss scale: 4096.0 | grad norm: 58774.552 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 449/ 159576 | consumed samples: 7184 | elapsed time per iteration (ms): 13533.7 | learning rate: 1.993E-06 | global batch size: 16 | lm loss: 8.147743E+00 | loss scale: 4096.0 | grad norm: 62996.237 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 450/ 159576 | consumed samples: 7200 | elapsed time per iteration (ms): 13606.8 | learning rate: 1.997E-06 | global batch size: 16 | lm loss: 8.094215E+00 | loss scale: 4096.0 | grad norm: 28180.185 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 451/ 159576 | consumed samples: 7216 | elapsed time per iteration (ms): 14132.6 | learning rate: 2.001E-06 | global batch size: 16 | lm loss: 7.781518E+00 | loss scale: 4096.0 | grad norm: 44504.183 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 452/ 159576 | consumed samples: 7232 | elapsed time per iteration (ms): 13368.4 | learning rate: 2.006E-06 | global batch size: 16 | lm loss: 8.044688E+00 | loss scale: 4096.0 | grad norm: 88794.745 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 453/ 159576 | consumed samples: 7248 | elapsed time per iteration (ms): 13584.3 | learning rate: 2.010E-06 | global batch size: 16 | lm loss: 7.851390E+00 | loss scale: 4096.0 | grad norm: 63860.892 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 454/ 159576 | consumed samples: 7264 | elapsed time per iteration (ms): 13723.9 | learning rate: 2.015E-06 | global batch size: 16 | lm loss: 7.919715E+00 | loss scale: 4096.0 | grad norm: 52314.539 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 455/ 159576 | consumed samples: 7280 | elapsed time per iteration (ms): 13869.1 | learning rate: 2.019E-06 | global batch size: 16 | lm loss: 7.873841E+00 | loss scale: 4096.0 | grad norm: 34440.715 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 456/ 159576 | consumed samples: 7296 | elapsed time per iteration (ms): 13582.9 | learning rate: 2.024E-06 | global batch size: 16 | lm loss: 8.021425E+00 | loss scale: 4096.0 | grad norm: 38108.651 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 457/ 159576 | consumed samples: 7312 | elapsed time per iteration (ms): 13563.2 | learning rate: 2.028E-06 | global batch size: 16 | lm loss: 8.019066E+00 | loss scale: 4096.0 | grad 
norm: 24882.231 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 458/ 159576 | consumed samples: 7328 | elapsed time per iteration (ms): 13638.8 | learning rate: 2.033E-06 | global batch size: 16 | lm loss: 8.016552E+00 | loss scale: 4096.0 | grad norm: 20634.945 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 459/ 159576 | consumed samples: 7344 | elapsed time per iteration (ms): 13616.8 | learning rate: 2.037E-06 | global batch size: 16 | lm loss: 7.754219E+00 | loss scale: 4096.0 | grad norm: 43242.810 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 460/ 159576 | consumed samples: 7360 | elapsed time per iteration (ms): 13985.2 | learning rate: 2.041E-06 | global batch size: 16 | lm loss: 7.788671E+00 | loss scale: 4096.0 | grad norm: 38608.351 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 461/ 159576 | consumed samples: 7376 | elapsed time per iteration (ms): 13736.9 | learning rate: 2.046E-06 | global batch size: 16 | lm loss: 7.806537E+00 | loss scale: 4096.0 | grad norm: 32594.750 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 462/ 159576 | consumed samples: 7392 | elapsed time per iteration (ms): 13386.0 | learning rate: 2.050E-06 | global batch size: 16 | lm loss: 7.940393E+00 | loss scale: 4096.0 | grad norm: 27037.582 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 463/ 159576 | consumed samples: 7408 | elapsed time per iteration (ms): 13564.9 | learning rate: 2.055E-06 | global batch size: 16 | lm loss: 7.988055E+00 | loss scale: 4096.0 | grad norm: 27394.266 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 464/ 159576 | consumed samples: 7424 | elapsed time per iteration (ms): 14013.6 | learning rate: 2.059E-06 | global batch size: 16 | lm loss: 8.004810E+00 | loss scale: 4096.0 | grad norm: 43759.686 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 465/ 159576 | consumed samples: 7440 | elapsed time per iteration (ms): 13546.2 | learning rate: 2.064E-06 | global batch size: 16 | lm loss: 7.704327E+00 | loss scale: 4096.0 | grad norm: 30191.115 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 466/ 159576 | consumed samples: 7456 | elapsed time per iteration (ms): 13671.9 | learning rate: 2.068E-06 | global batch size: 16 | lm loss: 7.774131E+00 | loss scale: 4096.0 | grad norm: 26963.554 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 467/ 159576 | consumed samples: 7472 | elapsed time per iteration (ms): 13643.6 | learning rate: 2.072E-06 | global batch size: 16 | lm loss: 7.856277E+00 | loss scale: 4096.0 | grad norm: 19255.502 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 468/ 159576 | consumed samples: 7488 | elapsed time per iteration (ms): 13848.0 | learning rate: 2.077E-06 | global batch size: 16 | lm loss: 7.999278E+00 | loss scale: 4096.0 | grad norm: 61835.255 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 469/ 159576 | consumed samples: 7504 | elapsed time per iteration (ms): 
13946.4 | learning rate: 2.081E-06 | global batch size: 16 | lm loss: 7.747583E+00 | loss scale: 4096.0 | grad norm: 42910.556 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 470/ 159576 | consumed samples: 7520 | elapsed time per iteration (ms): 13471.2 | learning rate: 2.086E-06 | global batch size: 16 | lm loss: 7.847405E+00 | loss scale: 4096.0 | grad norm: 29043.806 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 471/ 159576 | consumed samples: 7536 | elapsed time per iteration (ms): 13595.6 | learning rate: 2.090E-06 | global batch size: 16 | lm loss: 7.886540E+00 | loss scale: 4096.0 | grad norm: 22573.188 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 472/ 159576 | consumed samples: 7552 | elapsed time per iteration (ms): 13582.6 | learning rate: 2.095E-06 | global batch size: 16 | lm loss: 7.949501E+00 | loss scale: 4096.0 | grad norm: 81307.755 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 473/ 159576 | consumed samples: 7568 | elapsed time per iteration (ms): 13977.1 | learning rate: 2.099E-06 | global batch size: 16 | lm loss: 7.798001E+00 | loss scale: 4096.0 | grad norm: 27221.701 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 474/ 159576 | consumed samples: 7584 | elapsed time per iteration (ms): 13666.7 | learning rate: 2.104E-06 | global batch size: 16 | lm loss: 7.990824E+00 | loss scale: 4096.0 | grad norm: 50253.500 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -saving checkpoint at iteration 474 to /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints -[2021-09-24 04:00:46,754] [INFO] [logging.py:68:log_dist] [Rank 0] Saving model checkpoint: /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/global_step474/mp_rank_00_model_states.pt - successfully saved checkpoint at iteration 474 to /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints -time (ms) | save-checkpoint: 17639.87 -[exiting program after 110.0032222946485 minutes] datetime: 2021-09-24 04:00:58 -***************************************** -Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. -***************************************** -***************************************** -Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. -***************************************** -***************************************** -Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. -***************************************** -***************************************** -Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. 
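The iteration records above are pipe-delimited, one record per line, so the loss curve, learning-rate ramp, and gradient-norm spikes can be pulled out mechanically. A minimal Python sketch, assuming the exact field layout shown above; parse_log and the metric names it emits are hypothetical helpers, not part of Megatron-DeepSpeed:

import re

# Matches the pipe-delimited iteration records shown above (hypothetical helper).
RECORD = re.compile(
    r"iteration\s+(\d+)/\s*\d+ \| consumed samples: (\d+) \| "
    r"elapsed time per iteration \(ms\): ([\d.]+) \| learning rate: ([\dE.+-]+) \| "
    r"global batch size: (\d+) \| lm loss: ([\dE.+-]+) \| loss scale: ([\d.]+) \| "
    r"grad norm: ([\d.]+)"
)

def parse_log(lines):
    """Yield one dict of metrics per iteration record found in the log."""
    for line in lines:
        m = RECORD.search(line)
        if m:
            yield {
                "iteration": int(m.group(1)),
                "consumed_samples": int(m.group(2)),
                "iter_ms": float(m.group(3)),
                "lr": float(m.group(4)),
                "global_batch_size": int(m.group(5)),
                "lm_loss": float(m.group(6)),
                "loss_scale": float(m.group(7)),
                "grad_norm": float(m.group(8)),
            }

with open("main_log.txt") as f:  # hypothetical local copy of this log
    for rec in parse_log(f):
        print(rec["iteration"], rec["lm_loss"], rec["grad_norm"])

A quick consistency check on a parsed log: consumed samples advances by exactly the global batch size (16) per iteration, e.g. 5280 at iteration 330 and 7584 at iteration 474.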
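The constant loss scale: 4096.0 together with zero skipped and zero nan iterations in every record reflects fp16 dynamic loss scaling: the scale is reduced when gradients overflow (and that step is skipped), and raised again after a window of clean steps. A minimal sketch of that scheme, assuming the common halve-on-overflow / grow-after-N-good-steps policy rather than the exact Megatron/DeepSpeed implementation:

class DynamicLossScaler:
    """Sketch: halve on overflow, double after growth_interval clean steps."""

    def __init__(self, scale=4096.0, growth_interval=1000):
        self.scale = scale                    # matches "loss scale" in the log
        self.growth_interval = growth_interval
        self.clean_steps = 0

    def update(self, found_overflow: bool) -> bool:
        """Return True if the optimizer step should be applied."""
        if found_overflow:
            self.scale = max(self.scale / 2.0, 1.0)  # back off
            self.clean_steps = 0
            return False  # counted as a "skipped iteration" in the log
        self.clean_steps += 1
        if self.clean_steps % self.growth_interval == 0:
            self.scale *= 2.0  # try a larger scale again
        return True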
-*****************************************
-Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
-*****************************************
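The banner above is printed once per launched rank by the PyTorch distributed launcher whenever OMP_NUM_THREADS is unset, which is why it repeats for every process each time the job restarts. A minimal sketch of the tuning the message suggests, assuming a wrapper that runs before torch initializes; cpus_per_rank is a hypothetical, operator-chosen value (roughly cores_per_node // ranks_per_node):

import os

# Pin intra-op threads per rank before torch's thread pools are created.
cpus_per_rank = 4  # hypothetical value; tune to the node's CPU/GPU ratio
os.environ.setdefault("OMP_NUM_THREADS", str(cpus_per_rank))

import torch  # imported after the env var is set so the setting takes effect
torch.set_num_threads(cpus_per_rank)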
---------------------------------------------------
-DeepSpeed C++/CUDA extension op report
---------------------------------------------------
-NOTE: Ops not installed will be just-in-time (JIT) compiled at
- runtime if needed. Op compatibility means that your system
- meet the required dependencies to JIT install the op.
---------------------------------------------------
-JIT compiled ops requires ninja
-ninja .................. [OKAY]
---------------------------------------------------
-op name ................ installed .. compatible
---------------------------------------------------
-cpu_adam ............... [YES] ...... [OKAY]
-fused_adam ............. [NO] ....... [OKAY]
-fused_lamb ............. [NO] ....... [OKAY]
-sparse_attn ............ [NO] ....... [OKAY]
-transformer ............ [NO] ....... [OKAY]
-stochastic_transformer . [NO] ....... [OKAY]
---------------------------------------------------
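The table above is DeepSpeed's standard extension op report (the same one the ds_report CLI prints), emitted once per rank and interleaved in the raw log. Individual op compatibility can also be queried programmatically; a sketch, assuming the op_builder layout of this DeepSpeed fork matches upstream 0.4.x:

    from deepspeed.ops.op_builder import CPUAdamBuilder, FusedAdamBuilder

    # Mirrors the "compatible" column of the report: True means the op's
    # system dependencies are present and it can be JIT-built on this node.
    print("cpu_adam  :", CPUAdamBuilder().is_compatible())
    print("fused_adam:", FusedAdamBuilder().is_compatible())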
- [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`.
-async_io ............... [NO] ....... [NO]
-transformer_inference .. [NO] ....... [OKAY]
-utils .................. [YES] ...... [OKAY]
-quantizer .............. [NO] ....... [OKAY]
---------------------------------------------------
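async_io is the only op reported as unavailable, and the warning names the fix. A sketch for verifying the fix took effect after installing the libaio headers (apt install libaio-dev), assuming upstream's AsyncIOBuilder is present in this fork:

    from deepspeed.ops.op_builder import AsyncIOBuilder

    # Should flip to True once libaio-dev is installed; the op can then be
    # JIT-built at runtime, or prebuilt by reinstalling the wheel with the
    # DS_BUILD_AIO=1 environment flag set.
    print("async_io compatible:", AsyncIOBuilder().is_compatible())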
--------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. 
--------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] -ninja .................. [OKAY] -ninja .................. [OKAY] --------------------------------------------------- --------------------------------------------------- --------------------------------------------------- -op name ................ installed .. compatible -op name ................ installed .. compatible --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... 
[OKAY] -stochastic_transformer . [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -transformer_inference .. [NO] ....... [OKAY] -async_ioasync_io .............................. [NO][NO] .............. [NO][NO] - --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -transformer_inference transformer_inference.. ..[NO] [NO]....... .......[OKAY] -[OKAY] -utils .................. [YES] ...... [OKAY] -utils utils.................. ..................[YES] [YES]...... ......[OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -[OKAY] -quantizer .............. quantizer[NO] ..................... [NO][OKAY] -....... [OKAY] --------------------------------------------------- --------------------------------------------------- -ninja .................. [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. --------------------------------------------------- -async_io ............... [NO] ....... [NO] -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -transformer_inference .. [NO] ....... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -quantizer .............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] --------------------------------------------------- -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system -fused_adam ............. [NO] ....... [OKAY] - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ 
-DeepSpeed general environment info:
-torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']
-torch version .................... 1.8.1
-torch cuda version ............... 11.1
-nvcc version ..................... 11.2
-deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']
-deepspeed info ................... 0.4.2+bc17042, bc17042, big-science
-deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1
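This block pins down the software stack for the run: torch 1.8.1 built against CUDA 11.1, nvcc 11.2 on the system, and DeepSpeed 0.4.2 from the big-science branch at commit bc17042. The same facts can be queried directly; a minimal sketch using only standard torch/deepspeed attributes:

    # Sketch: print the same environment facts the log records.
    import torch
    import deepspeed

    print("torch install path ...", torch.__path__)         # site-packages location
    print("torch version ........", torch.__version__)      # 1.8.1 in this run
    print("torch cuda version ...", torch.version.cuda)     # 11.1 (the build's CUDA)
    print("deepspeed version ....", deepspeed.__version__)  # 0.4.2+bc17042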
[OKAY] --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... 
[OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. --------------------------------------------------- -DeepSpeed C++/CUDA extension op report -async_io ............... [NO] ....... [NO] --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -ninja .................. [OKAY] --------------------------------------------------- -async_io ............... [NO] ....... [NO] -op name ................ installed .. compatible --------------------------------------------------- -transformer_inference .. [NO] ....... [OKAY] -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] --------------------------------------------------- -fused_lamb ............. [NO] ....... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system -sparse_attn ............ [NO] ....... [OKAY] - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -ninja .................. [OKAY] --------------------------------------------------- -transformer_inference .. [NO] ....... [OKAY] -op name ................ installed .. compatible -utils .................. [YES] ...... 
[OKAY] --------------------------------------------------- -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_io ............... async_io[NO] ...................... [NO][NO] - ....... [NO] -transformer_inference .. transformer_inference[NO] ......... [NO][OKAY] -....... [OKAY] -utils .................. [YES]utils ...... ..................[OKAY] -[YES] ...... [OKAY] -quantizer .............. [NO]quantizer ..................... [OKAY][NO] - ....... [OKAY] --------------------------------------------------- --------------------------------------------------- -ninja .................. [OKAY] --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -op name ................ installed .. compatible --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_io ............... async_io[NO] ...................... [NO][NO] -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] - ....... [NO] -fused_lamb ............. [NO] ....... [OKAY] -transformer_inference .. [NO] .......transformer_inference [OKAY].. - [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -utils .................. [YES] ......utils [OKAY].................. -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] - [YES] ...... [OKAY] -quantizer .............. [NO]quantizer ..................... [OKAY][NO] - ....... 
[OKAY] --------------------------------------------------- --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_ioasync_io .............................. [NO][NO] .............. [NO][NO] - -transformer_inferencetransformer_inference .... [NO][NO] .............. [OKAY][OKAY] - -utils utils.................. ..................[YES] [YES]...... ......[OKAY] -[OKAY] -quantizer .............. quantizer[NO] ..................... [NO] [OKAY]....... - [OKAY] --------------------------------------------------- --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install pathtorch install path .............................. ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch versiontorch version ........................................ 1.8.11.8.1 - -torch cuda versiontorch cuda version .............................. 11.111.1 - -nvcc versionnvcc version .......................................... 11.211.2 - -deepspeed install pathdeepspeed install path ...................... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed infodeepspeed info ...................................... 0.4.2+bc17042, bc17042, big-science0.4.2+bc17042, bc17042, big-science - -deepspeed wheel compiled w.deepspeed wheel compiled w. ............ torch 1.8, cuda 11.1torch 1.8, cuda 11.1 - -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -fused_adam ............. [NO] ....... [OKAY] -async_io ............... [NO] ....... [NO] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer_inference .. [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. 
[OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -cpu_adam ............... [YES] ...... [OKAY] -async_io ............... [NO] ....... [NO] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -sparse_attn ............ [NO] ....... [OKAY] -transformer_inference .. [NO] ....... [OKAY] -async_io ...............utils [NO].................. .......[YES] [NO]...... - [OKAY] -transformer ............ [NO] ....... [OKAY] -quantizer .............. [NO] .......transformer_inference [OKAY].. -stochastic_transformer . [NO] ....... [OKAY] - [NO] ....... --------------------------------------------------[OKAY] - -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. 
[NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -async_io ............... [NO] ....... transformer_inference[NO] -.. [NO] ....... [OKAY] -utils .................. [YES]transformer_inference ........ [OKAY][NO] - ....... quantizer .............. [OKAY][NO] - ....... [OKAY] -utils-------------------------------------------------- -.................. [YES] ...... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -quantizer .............. [NO] ....... 
[OKAY] --------------------------------------------------- -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -ninja .................. [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -op name ................ installed .. compatible --------------------------------------------------- -async_io ............... [NO] ....... [NO] -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -transformer_inference .. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -ninja .................. 
[OKAY] -JIT compiled ops requires ninja --------------------------------------------------- -ninja .................. [OKAY] -op name ................ installed .. compatible --------------------------------------------------- --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -op name ................ installed .. compatible --------------------------------------------------- -fused_adam ............. [NO] ....... [OKAY] -cpu_adam ............... [YES] ...... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... 
[OKAY] --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -DeepSpeed general environment info: -utils .................. [YES] ...... [OKAY] -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -quantizer .............. [NO] ....... [OKAY] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 --------------------------------------------------- -nvcc version ..................... 11.2 -deepspeed install path ........... 
['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install pathtorch install path ............... ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch version torch version.................... ....................1.8.1 -1.8.1 -torch cuda version torch cuda version............... ...............11.1 -11.1 -nvcc versionnvcc version .......................................... 11.211.2 - -deepspeed install pathdeepspeed install path ...................... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed infodeepspeed info ...................................... 0.4.2+bc17042, bc17042, big-science0.4.2+bc17042, bc17042, big-science - -deepspeed wheel compiled w.deepspeed wheel compiled w. ............ torch 1.8, cuda 11.1torch 1.8, cuda 11.1 - - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_io async_io............... ...............[NO] [NO]....... .......[NO] -[NO] -DeepSpeed general environment info: -transformer_inference .. [NO]transformer_inference ......... [OKAY][NO] - ....... [OKAY] -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -utils ..................utils [YES].................. [YES]...... ......[OKAY] -[OKAY] -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -quantizer .............. quantizer[NO] ..................... [NO][OKAY] -....... [OKAY] --------------------------------------------------- --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. 
--------------------------------------------------- -JIT compiled ops requires ninja -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -DeepSpeed general environment info: -transformer_inference .. [NO] ....... [OKAY] -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -utils .................. [YES] ...... [OKAY] -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 --------------------------------------------------- -JIT compiled ops requires ninja -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -DeepSpeed general environment info: -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -torch version .................... 1.8.1 -torch cuda version ............... 11.1 --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -JIT compiled ops requires ninja -ninja .................. 
[OKAY] --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. -fused_adam ............. [NO] ....... [OKAY] --------------------------------------------------- -JIT compiled ops requires ninja -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -DeepSpeed general environment info: -JIT compiled ops requires ninja -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... 
[OKAY] --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -async_io ............... [NO] ....... [NO] -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -ninja .................. [OKAY] -transformer_inference .. [NO] ....... [OKAY] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 --------------------------------------------------- -utils .................. [YES] ...... [OKAY] -op name ................ installed .. compatible --------------------------------------------------- -quantizer .............. [NO] ....... [OKAY] -cpu_adam ............... [YES] ...... [OKAY] --------------------------------------------------- -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- --------------------------------------------------- -JIT compiled ops requires ninja -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. 
--------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -ninja .................. [OKAY] -ninja .................. [OKAY] -fused_lamb ............. [NO] ....... [OKAY] --------------------------------------------------- --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -op name ................ installed .. compatible -sparse_attn ............ [NO] ....... [OKAY] -cpu_adam ............... [YES] ...... [OKAY] --------------------------------------------------- -transformer ............ [NO] ....... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -cpu_adam ............... [YES] ...... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... 
[OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_ioasync_io .............................. [NO][NO] .............. [NO][NO] - -transformer_inference .. [NO] .......transformer_inference [OKAY].. - [NO] ....... [OKAY] -utils .................. [YES] ......utils [OKAY].................. - [YES] ...... quantizer[OKAY] -.............. [NO]quantizer ..................... [NO][OKAY] -....... [OKAY] --------------------------------------------------- --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] -quantizer .............. [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`.-------------------------------------------------- --------------------------------------------------- - -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... 
-DeepSpeed general environment info:
-torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']
-torch version .................... 1.8.1
-torch cuda version ............... 11.1
-nvcc version ..................... 11.2
-deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']
-deepspeed info ................... 0.4.2+bc17042, bc17042, big-science
-deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1
-/bin/sh: line 0: type: git: not found
-**** Git info for Megatron: git_hash=unknown git_branch=unknown ****
['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path DeepSpeed general environment info:........... -['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... torch install path0.4.2+bc17042, bc17042, big-science -...............deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -DeepSpeed general environment info: -quantizer .............. [NO] ....... [OKAY] -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 --------------------------------------------------- -torch cuda version ............... 11.1 -nvcc versionDeepSpeed general environment info: ..................... 11.2 - -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']torch install path - deepspeed info............... ................... 0.4.2+bc17042, bc17042, big-science -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -deepspeed wheel compiled w. ......['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... 
['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -async_io ............... [NO] ....... [NO] -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -transformer_inference .. [NO] ....... [OKAY] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -ninja .................. --------------------------------------------------[OKAY] - --------------------------------------------------- -DeepSpeed C++/CUDA extension op report -op name-------------------------------------------------- -................ NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.installed -..-------------------------------------------------- -compatibleJIT compiled ops requires ninja - --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... 
[OKAY] -DeepSpeed general environment info: -torch install path ............... DeepSpeed general environment info:['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch version torch install path.................... ...............1.8.1 -torch cuda version ............... 11.1['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -nvcc version .....................torch version 11.2.................... - deepspeed install path1.8.1 -........... torch cuda version ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']............... - 11.1deepspeed info - ...................nvcc version 0.4.2+bc17042, bc17042, big-science..................... - 11.2deepspeed wheel compiled w. - deepspeed install path...... ...........torch 1.8, cuda 11.1 -['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -DeepSpeed general environment info: -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -torch install path ............... DeepSpeed general environment info:['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch version .................... 1.8.1 -torch install path torch cuda version............... ............... 11.1 -nvcc version .....................['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] 11.2 - -deepspeed install path torch version........... .................... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']1.8.1 - -deepspeed info ...................torch cuda version 0.4.2+bc17042, bc17042, big-science............... - deepspeed wheel compiled w.11.1 -......nvcc version torch 1.8, cuda 11.1..................... - 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... 
torch 1.8, cuda 11.1 -JIT compiled ops requires ninja -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -async_io ............... [NO] ....... [NO] -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -transformer_inference .. [NO] ....... [OKAY] -DeepSpeed general environment info: -DeepSpeed general environment info: -utils .................. [YES] ...... [OKAY] -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -quantizer .............. [NO] ....... [OKAY] -torch version .................... 1.8.1 -torch version .................... 1.8.1 --------------------------------------------------- -torch cuda version ............... 11.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -nvcc version ..................... 11.2 -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -ninja .................. [OKAY] --------------------------------------------------- -transformer_inference .. [NO] ....... [OKAY] -op name ................ installed .. compatible --------------------------------------------------- -utils .................. [YES] ...... [OKAY] -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -quantizer .............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] --------------------------------------------------- -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... 
[OKAY] -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install pathtorch install path .............................. ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch version torch version.................... ....................1.8.1 -1.8.1 -torch cuda version torch cuda version............... ...............11.1 -11.1 -nvcc version nvcc version..................... .....................11.2 -11.2 -deepspeed install pathdeepspeed install path ...................... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed infodeepspeed info ...................................... 0.4.2+bc17042, bc17042, big-science0.4.2+bc17042, bc17042, big-science - -deepspeed wheel compiled w.deepspeed wheel compiled w. ............ torch 1.8, cuda 11.1torch 1.8, cuda 11.1 - --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -DeepSpeed general environment info: -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -fused_adam ............. [NO] ....... [OKAY] -torch cuda version ............... 11.1 -fused_lamb ............. [NO] ....... [OKAY] -nvcc version ..................... 11.2 -sparse_attn ............ [NO] ....... [OKAY] -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -transformer ............ [NO] ....... [OKAY] -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -stochastic_transformer . [NO] ....... [OKAY] -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 
1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_io ...............async_io [NO]............... .......[NO] [NO]....... - [NO] -transformer_inference ..transformer_inference [NO].. .......[NO] [OKAY]....... - [OKAY] -utils .................. utils[YES] ........................ [YES][OKAY] -...... [OKAY] -quantizerquantizer ............................ [NO][NO] .............. [OKAY][OKAY] - ----------------------------------------------------------------------------------------------------- - - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -/bin/sh: line 0: type: git: not found -transformer_inference .. [NO] ....... [OKAY] -/bin/sh: line 0: type: git: not found -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_ioasync_io .............................. [NO][NO] .............. [NO][NO] - -transformer_inference .. [NO] ....... [OKAY] -transformer_inferenceutils .................... 
[NO][YES] ............. [OKAY][OKAY] - -quantizer .............. utils[NO] ......................... [OKAY][YES] - ...... [OKAY] --------------------------------------------------- -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -ninja .................. [OKAY] --------------------------------------------------- -DeepSpeed general environment info: -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -fused_adam ............. [NO] ....... [OKAY] -deepspeed install path ........... 
['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -fused_lamb ............. [NO] ....... [OKAY] -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -stochastic_transformer . [NO] ....... [OKAY] -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -async_io ............... [NO] ....... [NO] -stochastic_transformer . [NO] ....... [OKAY] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 
11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -stochastic_transformer . [NO] ....... [OKAY] -JIT compiled ops requires ninja -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] .......-------------------------------------------------- [OKAY] - -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... 
torch 1.8, cuda 11.1 -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -sparse_attn ............ [NO] ....... [OKAY] -JIT compiled ops requires ninja -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -DeepSpeed general environment info: -utils .................. [YES] ...... [OKAY] -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -quantizer .............. [NO] ....... [OKAY] -torch version .................... 1.8.1 --------------------------------------------------- -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -ninja .................. [OKAY] -transformer_inference .. [NO] ....... [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -utils .................. [YES] ...... [OKAY] -cpu_adam ............... [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -DeepSpeed general environment info: -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -sparse_attn ............ 
[NO] ....... [OKAY] -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -transformer ............ [NO] ....... [OKAY] -/bin/sh: line 0: type: git: not found -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -stochastic_transformer . [NO] ....... [OKAY] -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ...............async_io [NO]............... .......[NO] [NO]....... - [NO] -transformer_inference transformer_inference.. ..[NO] [NO]....... .......[OKAY] -[OKAY] -utils utils.................. ..................[YES] [YES]...... ......[OKAY] -[OKAY] -quantizerquantizer ............................ [NO][NO] .............. [OKAY][OKAY] - ----------------------------------------------------------------------------------------------------- - -/bin/sh: line 0: type: git: not found - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -transformer_inference .. [NO] ....... [OKAY] -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -utils .................. [YES] ...... [OKAY] -nvcc version ..................... 11.2 -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 
1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install pathtorch install path .............................. ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch versiontorch version .................... ....................1.8.1 -1.8.1 -torch cuda version torch cuda version............... ...............11.1 -11.1 -nvcc version nvcc version..................... .....................11.2 -11.2deepspeed install path - deepspeed install path........... ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']deepspeed info - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - ...................deepspeed info 0.4.2+bc17042, bc17042, big-science................... - 0.4.2+bc17042, bc17042, big-sciencedeepspeed wheel compiled w. - deepspeed wheel compiled w....... ......torch 1.8, cuda 11.1 -torch 1.8, cuda 11.1 -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. 
--------------------------------------------------- -JIT compiled ops requires ninja -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -async_io ............... [NO]async_io ....... ...............[NO] -[NO] ....... [NO] -transformer_inferencetransformer_inference .... [NO][NO] .............. [OKAY][OKAY] - -utilsutils .................................... [YES][YES] ............ [OKAY][OKAY] - -quantizerquantizer ............................ [NO][NO] ....... .......[OKAY] -[OKAY] --------------------------------------------------- --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -utils .................. [YES] ...... [OKAY] -async_io ...............async_io [NO] ...................... [NO][NO] -....... [NO] -quantizer .............. [NO] ....... [OKAY] -transformer_inference .. transformer_inference[NO] ......... [NO][OKAY] -....... [OKAY] --------------------------------------------------- -utils .................. utils[YES] ........................ [YES][OKAY] -...... [OKAY] -quantizer .............. [NO]quantizer ..................... [OKAY][NO] - ....... [OKAY] --------------------------------------------------- --------------------------------------------------- -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... 
['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -/bin/sh: line 0: type: git: not found -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -async_io ............... [NO] ....... [NO] -torch version .................... 1.8.1 -transformer_inference .. [NO] ....... [OKAY] -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -utils .................. [YES] ...... [OKAY] -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -quantizer .............. [NO] ....... [OKAY] -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 --------------------------------------------------- -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -DeepSpeed general environment info: -torch install path ...............DeepSpeed general environment info: -['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch install path ...............torch version .................... 1.8.1 -['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']torch cuda version - ............... 11.1torch version - nvcc version.................... .....................1.8.1 -11.2 -torch cuda versiondeepspeed install path .......................... 11.1 -['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -nvcc version deepspeed info..................... 
...................11.2 -0.4.2+bc17042, bc17042, big-science -deepspeed install path deepspeed wheel compiled w............ ...... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']torch 1.8, cuda 11.1 - -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] --------------------------------------------------- -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. 
--------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report -DeepSpeed general environment info: --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -sparse_attn ............ [NO] ....... [OKAY] -async_io ............... [NO] ....... [NO] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES]  [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`....... [OKAY] - -DeepSpeed general environment info: -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -torch install path ............... 
['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -async_io ............... [NO] ....... [NO] -ninja .................. [OKAY] -torch version .................... 1.8.1 -transformer_inference .. [NO] ....... [OKAY] --------------------------------------------------- -torch cuda version ............... 11.1 -op name ................ installed .. compatible --------------------------------------------------- -nvcc version ..................... 11.2 -utils .................. [YES] ...... [OKAY] -cpu_adam ............... [YES] ...... [OKAY] -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -quantizer .............. [NO] ....... [OKAY] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 --------------------------------------------------- -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_io ............... [NO] .......async_io [NO] -............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -transformer_inference utils.. ..................[NO] [YES]....... ......[OKAY] -[OKAY] -quantizerutils ................................ [NO][YES] ............. [OKAY][OKAY] - --------------------------------------------------- -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... 
['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -/bin/sh: line 0: type: git: not found -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version .....................DeepSpeed general environment info: 11.2 -deepspeed install path - ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -torch install pathdeepspeed info .................................. 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. 
- -async_ioasync_io .............................. [NO][NO] .............. [NO][NO] - -transformer_inference .. transformer_inference[NO] ......... [OKAY][NO] - ....... [OKAY] -utils .................. [YES]utils ........................ [OKAY][YES] - ...... [OKAY] -quantizer .............. quantizer[NO] ..................... [NO][OKAY] -....... [OKAY] --------------------------------------------------- --------------------------------------------------- -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -ninja .................. [OKAY] -transformer_inference .. [NO] ....... [OKAY] --------------------------------------------------- -utils .................. [YES] ...... [OKAY] -op name ................ installed .. compatible --------------------------------------------------- -quantizer .............. [NO] ....... [OKAY] -cpu_adam ............... [YES] ...... [OKAY] --------------------------------------------------- -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -DeepSpeed general environment info: -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -torch install path ............... 
['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -DeepSpeed general environment info: -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -nvcc version ..................... 11.2 -torch version .................... 1.8.1 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch cuda version ............... 11.1 -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -torch version .................... 1.8.1 -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -torch cuda version ............... 11.1 -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -transformer_inference .. [NO] ....... [OKAY] -async_io ............... [NO] ....... [NO] -utils .................. [YES] ...... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`.transformer_inference .. -quantizer .............. [NO] ....... [OKAY] -[NO] ....... [OKAY] --------------------------------------------------- -utils .................. [YES] async_io...... [OKAY]............... - [NO] .......quantizer [NO] -.............. [NO] ....... [OKAY] ---------------------------------------------------transformer_inference -.. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - -/bin/sh: line 0: type: git: not found -DeepSpeed general environment info: -torch install path ............... 
['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -/bin/sh: line 0: type: git: not found -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -ninja .................. 
[OKAY] --------------------------------------------------- -op name ................ installed .. compatible -/bin/sh: line 0: type: git: not found --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 
0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [OKAY] - --------------------------------------------------- -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -/bin/sh: line 0: type: git: not found -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -DeepSpeed general environment info: -torch install path ............... 
['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -DeepSpeed general environment info: -stochastic_transformer . [NO] ....... [OKAY] -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. 
Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... DeepSpeed general environment info:11.1 -nvcc version -..................... 11.2 -deepspeed install path ...........torch install path ...............['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']deepspeed wheel compiled w. - ...... torch 1.8, cuda 11.1torch version - .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -DeepSpeed general environment info: -transformer_inference .. [NO] ....... [OKAY] -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -utils .................. [YES] ...... [OKAY] -nvcc version ..................... 11.2 -quantizer .............. [NO] ....... [OKAY] -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science --------------------------------------------------- -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... 
[OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install pathtorch install path .............................. ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch version torch version.................... ....................1.8.1 -1.8.1 -torch cuda version torch cuda version............... ...............11.1 - nvcc version11.1 -.....................nvcc version 11.2..................... - deepspeed install path11.2 -........... deepspeed install path ...........['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... ...................torch 1.8, cuda 11.1 -0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... 
[OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - -DeepSpeed general environment info: -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -DeepSpeed general environment info:torch install path - ............... torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -.................... 1.8.1torch version - .................... torch cuda version1.8.1 ............... - 11.1 -torch cuda versionnvcc version .................................... 11.111.2 - -nvcc versiondeepspeed install path ................................ 11.2 -['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']deepspeed install path - deepspeed info........... ................... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']0.4.2+bc17042, bc17042, big-science - -deepspeed info deepspeed wheel compiled w.................... ......0.4.2+bc17042, bc17042, big-science -torch 1.8, cuda 11.1 -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 
11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... 
['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -DeepSpeed general environment info: -torch install path ............... DeepSpeed general environment info: -['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch install pathtorch version .................... ...............1.8.1 -torch cuda version ............... 11.1['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -nvcc version .....................torch version 11.2.................... - deepspeed install path1.8.1 -........... 
torch cuda version['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -............... deepspeed info11.1 -................... nvcc version0.4.2+bc17042, bc17042, big-science -..................... deepspeed wheel compiled w.11.2 ...... - deepspeed install pathtorch 1.8, cuda 11.1 -........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install pathtorch install path .............................. ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch versiontorch version ........................................ 1.8.11.8.1 - -torch cuda versiontorch cuda version .............................. 11.111.1 - -nvcc versionnvcc version .......................................... 11.211.2 - -deepspeed install pathdeepspeed install path ...................... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed infodeepspeed info ...................................... 0.4.2+bc17042, bc17042, big-science0.4.2+bc17042, bc17042, big-science - -deepspeed wheel compiled w.deepspeed wheel compiled w. ............ torch 1.8, cuda 11.1torch 1.8, cuda 11.1 - -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install pathtorch install path .............................. ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch versiontorch version ........................................ 1.8.11.8.1 - -torch cuda version torch cuda version............... ...............11.1 -11.1nvcc version - .....................nvcc version 11.2..................... 
- deepspeed install path11.2 -...........deepspeed install path ...........['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -................... deepspeed info0.4.2+bc17042, bc17042, big-science -...................deepspeed wheel compiled w. 0.4.2+bc17042, bc17042, big-science...... - deepspeed wheel compiled w.torch 1.8, cuda 11.1 -...... torch 1.8, cuda 11.1 -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -DeepSpeed general environment info: -DeepSpeed general environment info:torch install path ............... -torch install path ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']............... - torch version .................... 1.8.1['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch cuda version torch version............... ....................11.1 -1.8.1nvcc version - .....................torch cuda version 11.2............... - deepspeed install path11.1 -........... nvcc version ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']..................... - 11.2deepspeed info - ...................deepspeed install path 0.4.2+bc17042, bc17042, big-science........... - deepspeed wheel compiled w.['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -......deepspeed info torch 1.8, cuda 11.1................... - 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -DeepSpeed general environment info: -torch install path ............... 
['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -DeepSpeed general environment info: -async_io ............... [NO] ....... [NO] -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -transformer_inference .. [NO] ....... [OKAY] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] -nvcc version ..................... 11.2 --------------------------------------------------- -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. 
-/bin/sh: line 0: type: git: not found
-**** Git info for Megatron: git_hash=unknown git_branch=unknown ****
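This pair of lines repeats for every rank: `git` is not on the compute nodes' PATH, so the shell probe fails and Megatron falls back to reporting an unknown hash and branch. A hedged sketch of that fallback behavior (illustrative only, not Megatron's actual code):

    # Sketch: resolve a git hash with a fallback when git is absent,
    # which is exactly the condition the log shows on the compute nodes.
    import shutil
    import subprocess

    def git_hash(fallback="unknown"):
        if shutil.which("git") is None:   # this probe fails on the nodes
            return fallback
        out = subprocess.run(["git", "rev-parse", "--short", "HEAD"],
                             capture_output=True, text=True)
        return out.stdout.strip() or fallback

    print(f"**** Git info for Megatron: git_hash={git_hash()} ****")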
['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... 
['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... 
torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found --------------------------------------------------- -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -DeepSpeed general environment info: -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -sparse_attn ............ [NO] ....... [OKAY] -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install pathtorch install path .............................. ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch versiontorch version ........................................ 1.8.11.8.1 - -torch cuda versiontorch cuda version .............................. 11.111.1 - -nvcc versionnvcc version .......................................... 11.211.2 - -deepspeed install pathdeepspeed install path ...................... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed infodeepspeed info ...................................... 0.4.2+bc17042, bc17042, big-science0.4.2+bc17042, bc17042, big-science - -deepspeed wheel compiled w.deepspeed wheel compiled w. ............ torch 1.8, cuda 11.1torch 1.8, cuda 11.1 - -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -DeepSpeed general environment info: -torch install path ............... 
['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. 
Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -DeepSpeed general environment info: -/bin/sh: line 0: type: git: not found -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... 
torch 1.8, cuda 11.1 -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install path torch install path............... ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch versiontorch version .................... ....................1.8.1 -1.8.1 -torch cuda version torch cuda version............... ...............11.1 -11.1nvcc version - nvcc version..................... .....................11.2 -11.2deepspeed install path - deepspeed install path........... ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed infodeepspeed info ...................................... 0.4.2+bc17042, bc17042, big-science -0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. deepspeed wheel compiled w....... ......torch 1.8, cuda 11.1 -torch 1.8, cuda 11.1 -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. 
--------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install pathtorch install path .............................. ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch versiontorch version ........................................ 1.8.11.8.1 - -torch cuda versiontorch cuda version .............................. 11.111.1 - -nvcc versionnvcc version .......................................... 11.211.2 - -deepspeed install pathdeepspeed install path ...................... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed infodeepspeed info ...................................... 0.4.2+bc17042, bc17042, big-science0.4.2+bc17042, bc17042, big-science - -deepspeed wheel compiled w.deepspeed wheel compiled w. ............ torch 1.8, cuda 11.1torch 1.8, cuda 11.1 - -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 
11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -/bin/sh: line 0: type: git: not found -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - -/bin/sh: line 0: type: git: not found - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -/bin/sh: line 0: type: git: not found -quantizer .............. [NO] ....... 
[OKAY] --------------------------------------------------- -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... 
[OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install pathtorch install path .............................. ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch versiontorch version ........................................ 1.8.11.8.1 - -torch cuda versiontorch cuda version .............................. 11.111.1 - -nvcc versionnvcc version .......................................... 11.211.2 - -deepspeed install pathdeepspeed install path ...................... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed info deepspeed info................... ...................0.4.2+bc17042, bc17042, big-science -0.4.2+bc17042, bc17042, big-sciencedeepspeed wheel compiled w. - deepspeed wheel compiled w....... ......torch 1.8, cuda 11.1 -torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... 
torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w.DeepSpeed general environment info: ...... torch 1.8, cuda 11.1 - -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 
11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** --------------------------------------------------- -DeepSpeed C++/CUDA extension op report ----------------------------------------------------------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - -DeepSpeed C++/CUDA extension op report --------------------------------------------------- ---------------------------------------------------JIT compiled ops requires ninja - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible -ninja-------------------------------------------------- - .................. [OKAY] --------------------------------------------------- -cpu_adamop name ............... ................[YES] installed...... ..[OKAY] -compatible --------------------------------------------------- -fused_adam cpu_adam............. [NO]............... ....... [YES][OKAY] -...... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -sparse_attnfused_lamb ............ .............[NO] [NO]....... .......[OKAY] - [OKAY] -transformer ............ [NO] ....... [OKAY] -sparse_attnstochastic_transformer ............. [NO][NO] .............. [OKAY][OKAY] - -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... 
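The "DeepSpeed general environment info" and "C++/CUDA extension op report" blocks above are printed once per launched rank, which is why the same text repeats; DeepSpeed's ds_report utility emits the same sections on demand. A minimal Python sketch that reproduces the environment fields, assuming torch and deepspeed are importable in the active conda env (the dotted labels only mimic the log layout and are not a DeepSpeed API):

    # Minimal sketch: reproduce the "DeepSpeed general environment info" fields.
    # Assumes torch and deepspeed are installed; labels mimic the log output.
    import torch
    import deepspeed

    print("torch install path ......", list(torch.__path__))
    print("torch version ...........", torch.__version__)
    print("torch cuda version ......", torch.version.cuda)
    print("deepspeed install path ..", list(deepspeed.__path__))
    print("deepspeed info ..........", deepspeed.__version__)

The async_io row reads [NO] ....... [NO] because the libaio-dev system package flagged in the warnings is missing, so that op is reported as neither installed nor compatible rather than being JIT-compiled.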
-> setting tensorboard ...
- [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`.
-async_io ............... [NO] ....... [NO]
-transformer_inference .. [NO] ....... [OKAY]
-utils .................. [YES] ...... [OKAY]
-quantizer .............. [NO] .......
[OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. 
[OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -DeepSpeed general environment info: -torch install path ............... 
['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ 
installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... 
torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** ----------------------------------------------------------------------------------------------------- - -DeepSpeed C++/CUDA extension op reportDeepSpeed C++/CUDA extension op report - ----------------------------------------------------------------------------------------------------- - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - ----------------------------------------------------------------------------------------------------- - -JIT compiled ops requires ninjaJIT compiled ops requires ninja - -ninjaninja .................................... [OKAY][OKAY] - ----------------------------------------------------------------------------------------------------- - -op nameop name ................................ installedinstalled .... compatiblecompatible - ----------------------------------------------------------------------------------------------------- - -cpu_adam cpu_adam............... ...............[YES] [YES]...... ......[OKAY] -[OKAY] -fused_adamfused_adam .......................... [NO][NO] .............. [OKAY][OKAY] - -fused_lambfused_lamb .......................... [NO][NO] .............. [OKAY][OKAY] - -sparse_attnsparse_attn ........................ [NO][NO] .............. [OKAY][OKAY] - -transformertransformer ........................ [NO][NO] .............. [OKAY][OKAY] - -stochastic_transformerstochastic_transformer . .[NO] [NO]....... .......[OKAY] -[OKAY] --------------------------------------------------- ---------------------------------------------------DeepSpeed C++/CUDA extension op report - ---------------------------------------------------DeepSpeed C++/CUDA extension op report - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.-------------------------------------------------- - ---------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - -JIT compiled ops requires ninja-------------------------------------------------- - -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. 
--------------------------------------------------- -JIT compiled ops requires ninja -ninjaninja .................. ..................[OKAY] -[OKAY] --------------------------------------------------- ---------------------------------------------------op name - ................ op nameinstalled .................. installedcompatible -.. --------------------------------------------------compatible - --------------------------------------------------- -cpu_adam cpu_adam............... ...............[YES] [YES]...... ...... [OKAY][OKAY] - -fused_adamfused_adam .......................... [NO][NO] .............. [OKAY][OKAY] - -fused_lambfused_lamb .......................... [NO][NO] .............. [OKAY][OKAY] - -sparse_attnsparse_attn ........................ [NO][NO] .............. [OKAY][OKAY] - -transformertransformer ........................ [NO][NO] .............. [OKAY][OKAY] - -stochastic_transformer stochastic_transformer . [NO]. .......[NO] [OKAY]....... - [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -fused_adamop name ............................. [NO]installed ......... [OKAY]compatible - --------------------------------------------------- -fused_lamb ............. [NO] ....... [OKAY] -cpu_adam ............... [YES] ...... [OKAY] -sparse_attn ............ [NO] .......fused_adam [OKAY] -............. [NO] transformer....... ............[OKAY] -[NO] ....... [OKAY]fused_lamb - ............. [NO] .......stochastic_transformer [OKAY] -. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] ----------------------------------------------------------------------------------------------------- - -DeepSpeed C++/CUDA extension op report -DeepSpeed C++/CUDA extension op report ----------------------------------------------------------------------------------------------------- - ---------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.-------------------------------------------------- - - --------------------------------------------------- ---------------------------------------------------DeepSpeed C++/CUDA extension op report -DeepSpeed C++/CUDA extension op report - - -JIT compiled ops requires ninjaJIT compiled ops requires ninja---------------------------------------------------------------------------------------------------- - - - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. 
Op compatibility means that your system - meet the required dependencies to JIT install the op.-------------------------------------------------- - ---------------------------------------------------JIT compiled ops requires ninja - -JIT compiled ops requires ninja - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -ninjaninjaninjaninja ........................................................................ [OKAY][OKAY][OKAY][OKAY] - - - --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- - - - -op nameop nameop nameop name ................................................................ installedinstalledinstalled installed ...... .. compatible compatible - -compatiblecompatible---------------------------------------------------------------------------------------------------- - - - ----------------------------------------------------------------------------------------------------- - -cpu_adam cpu_adam............... ...............cpu_adam[YES] cpu_adam [YES] ...... [OKAY]............... -..................... [YES][OKAY][YES] - ............ fused_adam[OKAY][OKAY] - -............. [NO] ....... fused_adam[OKAY] -............. [NO]fused_lambfused_adam fused_adam .................... ............. .............[OKAY] -[NO] [NO] [NO] .......fused_lamb ....... .................... [OKAY][NO] [OKAY] -[OKAY]....... - - [OKAY] -fused_lamb fused_lamb............. ............. [NO][NO] .......sparse_attn....... [OKAY] ............ -[OKAY]sparse_attn -[NO] ............ .......[NO] [OKAY]....... - [OKAY] -transformer ............sparse_attntransformer sparse_attn [NO]........................ ...................[NO][NO] [NO].......[OKAY]....... - .......[OKAY][OKAY] - -stochastic_transformer[OKAY] -transformer stochastic_transformer.............transformer [NO][NO]. ............ ....... [NO]....... [NO] [OKAY] ....... -[OKAY].......[OKAY] - - [OKAY] -stochastic_transformer stochastic_transformer. [NO] ........ [NO][OKAY] - ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_io ...............async_io [NO] ...................... [NO][NO] -....... [NO] -transformer_inferencetransformer_inference .... [NO][NO] .............. [OKAY][OKAY] - -utilsutils .................................... [YES][YES] ............ [OKAY][OKAY] - -quantizer ..............quantizer [NO].............. .......[NO] [OKAY]....... - [OKAY] --------------------------------------------------- --------------------------------------------------- -DeepSpeed general environment info: -torch install path ............... 
['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_ioasync_io .............................. [NO] [NO]....... .......[NO] -[NO] -transformer_inferencetransformer_inference .... [NO][NO] .............. [OKAY][OKAY] - -utils utils.................. ..................[YES] [YES]...... ......[OKAY] -[OKAY] -quantizer quantizer.............. ..............[NO] [NO]....... .......[OKAY] -[OKAY] --------------------------------------------------- --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utilstransformer_inference .................... [YES][NO] ............. [OKAY][OKAY] - -quantizer .............. utils[NO] ......................... [YES][OKAY] -...... [OKAY] --------------------------------------------------- -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. 
[OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install pathtorch install path .............................. ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch versiontorch version ........................................ 1.8.11.8.1 - -torch cuda versiontorch cuda version .............................. 11.111.1 - -nvcc versionnvcc version .......................................... 11.211.2 - -deepspeed install pathdeepspeed install path ...................... 
['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed infodeepspeed info ...................................... 0.4.2+bc17042, bc17042, big-science0.4.2+bc17042, bc17042, big-science - -deepspeed wheel compiled w.deepspeed wheel compiled w. ............ torch 1.8, cuda 11.1torch 1.8, cuda 11.1 - -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install pathtorch install path .............................. ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch versiontorch version ........................................ 1.8.11.8.1 - -torch cuda versiontorch cuda version .............................. 11.111.1 - -nvcc versionnvcc version .......................................... 11.211.2 - -deepspeed install pathdeepspeed install path ...................... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed infodeepspeed info ...................................... 0.4.2+bc17042, bc17042, big-science0.4.2+bc17042, bc17042, big-science - -deepspeed wheel compiled w.deepspeed wheel compiled w. ............ torch 1.8, cuda 11.1torch 1.8, cuda 11.1 - -DeepSpeed general environment info: -torch install path ............... DeepSpeed general environment info: -['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version torch install path.................... 1.8.1............... - torch cuda version ............... 11.1 -['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']nvcc version - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - ..................... torch version11.2 -....................deepspeed install path 1.8.1........... -async_io ............... [NO] ....... [NO] - ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']torch cuda version - ...............deepspeed info 11.1................... - 0.4.2+bc17042, bc17042, big-sciencenvcc version - deepspeed wheel compiled w...................... ......11.2 -torch 1.8, cuda 11.1deepspeed install path -transformer_inference .. [NO] ....... [OKAY] - ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -utils .................. [YES] ...... [OKAY] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- --------------------------------------------------- ---------------------------------------------------DeepSpeed C++/CUDA extension op report - --------------------------------------------------- -DeepSpeed C++/CUDA extension op reportNOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - ----------------------------------------------------------------------------------------------------- - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. 
Op compatibility means that your system - meet the required dependencies to JIT install the op.JIT compiled ops requires ninja-------------------------------------------------- - --------------------------------------------------- - -DeepSpeed C++/CUDA extension op reportJIT compiled ops requires ninja - --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. ----------------------------------------------------------------------------------------------------- -JIT compiled ops requires ninja - -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninjaninjaninjaninja ........................................................................ [OKAY] [OKAY] -[OKAY][OKAY] --------------------------------------------------- - - ----------------------------------------------------------------------------------------------------- --------------------------------------------------- -op nameop name - op name ................op name................................ ................installedinstalled installed installed .... ....compatible compatible - compatible-------------------------------------------------- -compatible --------------------------------------------------- - - ----------------------------------------------------------------------------------------------------- - -cpu_adam cpu_adam............... cpu_adam............... [YES]cpu_adam ............... [YES]..................... [OKAY][YES] ............ - [YES] [OKAY] [OKAY] -...... - [OKAY] -fused_adam ............. fused_adam[NO]fused_adam ....... fused_adam............. ............. [OKAY] [NO] -.............[NO] ....... fused_lamb[NO] [OKAY] ............. -....... ....... [NO] [OKAY] [OKAY] -fused_lamb....... - .............[OKAY] -[NO]fused_lambfused_lamb ................................. [OKAY][NO][NO] - .............. sparse_attn[OKAY][OKAY] - -............ [NO] ....... sparse_attn[OKAY] -............ [NO]transformer ................... [OKAY]sparse_attn[NO]sparse_attn - ...............................transformer [OKAY][NO]............[NO] - ....... [NO] .......[OKAY]stochastic_transformer - [OKAY]....... -.transformer[OKAY] transformer[NO] - ............................... stochastic_transformer[NO][OKAY] [NO] - ............... [NO] [OKAY] [OKAY] -....... - [OKAY]stochastic_transformer - stochastic_transformer . [NO]. .......[NO] [OKAY]....... - [OKAY] -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 
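Each rank prints the same three diagnostic blocks at launch: the general environment info, the C++/CUDA extension op report, and the async_io/utils/quantizer check. A minimal sketch for reproducing them by hand, assuming the DeepSpeed 0.4.x environment shown above (`ds_report` is DeepSpeed's diagnostics CLI; the optimizer construction at the end is only there to illustrate that ops reported as installed=[NO] are built with ninja on first use):

    # Sketch, assuming a DeepSpeed 0.4.x install like the one reported above.
    import subprocess

    import torch
    from deepspeed.ops.adam import DeepSpeedCPUAdam

    # Prints the same ninja status, op table and environment info as this log.
    subprocess.run(["ds_report"], check=True)

    # Ops listed as "[NO] ....... [OKAY]" are not pre-built but JIT-compatible:
    # constructing one triggers the ninja build. cpu_adam is pre-built here
    # ([YES]), so this returns without compiling anything.
    _ = DeepSpeedCPUAdam([torch.nn.Parameter(torch.zeros(4))], lr=6e-5)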
-using world size: 256, data-parallel-size: 8, tensor-model-parallel size: 4, pipeline-model-parallel size: 8
-using torch.float16 for parameters ...
-------------------------- arguments ------------------------
- accumulate_allreduce_grads_in_fp32 .............. False
- adam_beta1 ...................................... 0.9
- adam_beta2 ...................................... 0.999
- adam_eps ........................................ 1e-08
- adlr_autoresume ................................. False
- adlr_autoresume_interval ........................ 1000
- apply_query_key_layer_scaling ................... True
- apply_residual_connection_post_layernorm ........ False
- attention_dropout ............................... 0.1
- attention_softmax_in_fp32 ....................... False
- bert_binary_head ................................ True
- bert_load ....................................... None
- bf16 ............................................ False
- bias_dropout_fusion ............................. True
- bias_gelu_fusion ................................ True
- biencoder_projection_dim ........................ 0
- biencoder_shared_query_context_model ............ False
- block_data_path ................................. None
- checkpoint_activations .......................... True
- checkpoint_in_cpu ............................... False
- checkpoint_num_layers ........................... 1
- clip_grad ....................................... 1.0
- codecarbon_dir .................................. /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/tr8-104B-logs/codecarbon
- consumed_train_samples .......................... 0
- consumed_valid_samples .......................... 0
- contigious_checkpointing ........................ False
- cpu_optimizer ................................... False
- cpu_torch_adam .................................. False
- data_impl ....................................... mmap
- data_parallel_size .............................. 8
- data_path ....................................... ['/gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document']
- dataloader_type ................................. single
- DDP_impl ........................................ local
- decoder_seq_length .............................. None
- deepscale ....................................... False
- deepscale_config ................................ None
- deepspeed ....................................... True
- deepspeed_activation_checkpointing .............. True
- deepspeed_config ................................ ./ds_config.1162747.json
- deepspeed_mpi ................................... False
- distribute_checkpointed_activations ............. False
- distributed_backend ............................. nccl
- embedding_path .................................. None
- encoder_seq_length .............................. 2048
- eod_mask_loss ................................... False
- eval_interval ................................... 1000
- eval_iters ...................................... 5
- evidence_data_path .............................. None
- exit_duration_in_mins ........................... 110
- exit_interval ................................... None
- ffn_hidden_size ................................. 20480
- finetune ........................................ False
- fp16 ............................................ True
- fp16_lm_cross_entropy ........................... False
- fp32_residual_connection ........................ False
- global_batch_size ............................... 2048
- hidden_dropout .................................. 0.1
- hidden_size ..................................... 16384
- hysteresis ...................................... 2
- ict_head_size ................................... None
- ict_load ........................................ None
- img_dim ......................................... 224
- indexer_batch_size .............................. 128
- indexer_log_interval ............................ 1000
- init_method_std ................................. 0.02
- init_method_xavier_uniform ...................... False
- initial_loss_scale .............................. 4294967296
- kv_channels ..................................... 512
- layernorm_epsilon ............................... 1e-05
- lazy_mpu_init ................................... None
- load ............................................ /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints
- local_rank ...................................... 0
- log_batch_size_to_tensorboard ................... True
- log_interval .................................... 1
- log_learning_rate_to_tensorboard ................ True
- log_loss_scale_to_tensorboard ................... True
- log_num_zeros_in_grad ........................... False
- log_params_norm ................................. False
- log_timers_to_tensorboard ....................... True
- log_validation_ppl_to_tensorboard ............... True
- loss_scale ...................................... 12.0
- loss_scale_window ............................... 1000
- lr .............................................. 6e-05
- lr_decay_iters .................................. None
- lr_decay_samples ................................ 126953125
- lr_decay_style .................................. cosine
- lr_warmup_fraction .............................. None
- lr_warmup_iters ................................. 0
- lr_warmup_samples ............................... 216320
- make_vocab_size_divisible_by .................... 128
- mask_prob ....................................... 0.15
- masked_softmax_fusion ........................... True
- max_position_embeddings ......................... 2048
- memory_centric_tiled_linear ..................... False
- merge_file ...................................... /gpfswork/rech/six/commun/code/tr8-104B/Megatron-DeepSpeed-tr8-104B/data/gpt2-merges.txt
- micro_batch_size ................................ 1
- min_loss_scale .................................. 1.0
- min_lr .......................................... 6e-06
- mmap_warmup ..................................... False
- no_load_optim ................................... None
- no_load_rng ..................................... None
- no_save_optim ................................... None
- no_save_rng ..................................... None
- num_attention_heads ............................. 32
- num_channels .................................... 3
- num_classes ..................................... 1000
- num_layers ...................................... 32
- num_layers_per_virtual_pipeline_stage ........... None
- num_workers ..................................... 2
- onnx_safe ....................................... None
- openai_gelu ..................................... False
- optimizer ....................................... adam
- override_lr_scheduler ........................... False
- params_dtype .................................... torch.float16
- partition_activations ........................... False
- patch_dim ....................................... 16
- pipeline_model_parallel_size .................... 8
- position_embedding_type ......................... PositionEmbeddingType.absolute
- profile_backward ................................ False
- query_in_block_prob ............................. 0.1
- rampup_batch_size ............................... ['16', '16', '6_000_000']
- rank ............................................ 0
- remote_device ................................... none
- reset_attention_mask ............................ False
- reset_position_ids .............................. False
- retriever_report_topk_accuracies ................ []
- retriever_score_scaling ......................... False
- retriever_seq_length ............................ 256
- sample_rate ..................................... 1.0
- save ............................................ /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints
- save_interval ................................... 1500
- scatter_gather_tensors_in_pipeline .............. True
- scattered_embeddings ............................ False
- seed ............................................ 42
- seq_length ...................................... 2048
- sgd_momentum .................................... 0.9
- short_seq_prob .................................. 0.1
- split ........................................... 949,50,1
- split_transformers .............................. False
- synchronize_each_layer .......................... False
- tensor_model_parallel_size ...................... 4
- tensorboard_dir ................................. /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/tr8-104B-logs/tensorboard
- tensorboard_log_interval ........................ 1
- tensorboard_queue_size .......................... 5
- tile_factor ..................................... 1
- titles_data_path ................................ None
- tokenizer_name_or_path .......................... None
- tokenizer_type .................................. GPT2BPETokenizer
- train_iters ..................................... None
- train_samples ................................... 300000000
- use_checkpoint_lr_scheduler ..................... False
- use_contiguous_buffers_in_ddp ................... False
- use_cpu_initialization .......................... None
- use_one_sent_docs ............................... False
- use_pin_memory .................................. False
- virtual_pipeline_model_parallel_size ............ None
- vocab_extra_ids ................................. 0
- vocab_file ...................................... /gpfswork/rech/six/commun/code/tr8-104B/Megatron-DeepSpeed-tr8-104B/data/gpt2-vocab.json
- weight_decay .................................... 0.1
- world_size ...................................... 256
- zero_allgather_bucket_size ...................... 0.0
- zero_contigious_gradients ....................... False
- zero_reduce_bucket_size ......................... 0.0
- zero_reduce_scatter ............................. False
- zero_stage ...................................... 1
---------------------- end of arguments ---------------------
-will use batch size rampup starting from global batch size 16 to global batch size 2048 with batch size increments 16 over 6000000 samples.
-> building GPT2BPETokenizer tokenizer ...
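The launch banner ("using world size: 256, data-parallel-size: 8, tensor-model-parallel size: 4, pipeline-model-parallel size: 8") and the argument dump pin down the 3D-parallel layout and the batch-size schedule, and the "padded vocab" line just below follows from make_vocab_size_divisible_by and the tensor-parallel size. A plain-Python sketch of that arithmetic, with every constant copied from the dump (nothing here calls Megatron or DeepSpeed):

    # Re-derive the launch-time arithmetic from the arguments above.
    world_size = 256
    tensor_mp, pipeline_mp = 4, 8               # tensor/pipeline model-parallel sizes
    data_parallel = world_size // (tensor_mp * pipeline_mp)
    assert data_parallel == 8                   # "data-parallel-size: 8"

    # global_batch_size 2048 with micro_batch_size 1 on 8 replicas means 256
    # gradient-accumulation micro-batches per optimizer step once ramped up.
    global_batch, micro_batch = 2048, 1
    grad_accum_steps = global_batch // (micro_batch * data_parallel)
    assert grad_accum_steps == 256

    # rampup_batch_size ['16', '16', '6_000_000']: start at 16 and grow by 16
    # per increment, spread linearly over the first 6,000,000 samples.
    start, increment, ramp_samples = 16, 16, 6_000_000
    n_increments = (global_batch - start) // increment    # 127 increments
    samples_per_increment = ramp_samples // n_increments  # 47244 samples each

    # fp16 dynamic loss scaling starts at 2**32 (initial_loss_scale).
    assert 4294967296 == 2 ** 32

    # "> padded vocab (size: 50257) with 431 dummy tokens (new size: 50688)":
    # the vocabulary is padded to a multiple of make_vocab_size_divisible_by
    # (128) times the tensor-parallel size (4), i.e. 512.
    vocab, multiple = 50257, 128 * tensor_mp
    padded = -(-vocab // multiple) * multiple   # ceil to the next multiple
    assert padded == 50688 and padded - vocab == 431

So during the rampup the global batch grows by 16 roughly every 47k samples until it reaches 2048, after which each optimizer step accumulates 256 micro-batches per data-parallel replica.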
- [WARNING]  async_io requires the libraries: ['libaio-dev'] but they are missing. Can be fixed by: `apt install libaio-dev`.
-async_io ............... [NO] ....... [NO]
-transformer_inference .. [NO] ....... [OKAY]
-utils .................. [YES] ...... [OKAY]
-quantizer .............. [NO] ....... [OKAY]
---------------------------------------------------
---------------------------------------------------
-DeepSpeed C++/CUDA extension op report
---------------------------------------------------
-NOTE: Ops not installed will be just-in-time (JIT) compiled at
-      runtime if needed. Op compatibility means that your system
-      meets the required dependencies to JIT install the op.
---------------------------------------------------
-JIT compiled ops require ninja
-ninja .................. [OKAY]
---------------------------------------------------
-op name ................ installed .. compatible
---------------------------------------------------
-cpu_adam ............... [YES] ...... [OKAY]
-fused_adam ............. [NO] ....... [OKAY]
-fused_lamb ............. [NO] ....... [OKAY]
-sparse_attn ............ [NO] ....... [OKAY]
-transformer ............ [NO] ....... [OKAY]
-stochastic_transformer . [NO] ....... [OKAY]
-/bin/sh: line 0: type: git: not found
-**** Git info for Megatron: git_hash=unknown git_branch=unknown ****
-DeepSpeed general environment info:
-torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']
-torch version .................... 1.8.1
-torch cuda version ............... 11.1
-nvcc version ..................... 11.2
-deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']
-deepspeed info ................... 0.4.2+bc17042, bc17042, big-science
-deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1
-> padded vocab (size: 50257) with 431 dummy tokens (new size: 50688)
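The padded-vocab line follows from tensor parallelism: the embedding table must split evenly across the 4 tensor-parallel ranks, so Megatron pads the vocabulary up to the next multiple of its `make-vocab-size-divisible-by` value times the tensor-parallel size. A sketch of the arithmetic that reproduces the logged numbers (the default of 128 is an assumption here, since the argument dump above does not show that flag):

    import math

    def padded_vocab_size(orig_vocab, divisible_by=128, tp_size=4):
        # Pad so each tensor-parallel rank holds an equal slice of the embedding.
        multiple = divisible_by * tp_size                  # 512 here
        return math.ceil(orig_vocab / multiple) * multiple

    assert padded_vocab_size(50257) == 50688               # 431 dummy tokens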
-> setting codecarbon ...
-> initializing torch distributed ...
-> initializing tensor model parallel with size 4
-> initializing pipeline model parallel with size 8
-> setting random seeds to 42 ...
-[2021-09-24 04:01:23,432] [INFO] [checkpointing.py:226:model_parallel_cuda_manual_seed] > initializing model parallel cuda seeds on global rank 0, model parallel rank 0, and data parallel rank 0 with model parallel seed: 2760 and data parallel seed: 42
-> compiling dataset index builder ...
-make: Entering directory '/gpfsssd/worksf/projects/rech/six/commun/code/tr8-104B/Megatron-DeepSpeed-tr8-104B/megatron/data'
-make: Nothing to be done for 'default'.
-make: Leaving directory '/gpfsssd/worksf/projects/rech/six/commun/code/tr8-104B/Megatron-DeepSpeed-tr8-104B/megatron/data'
->>> done with dataset index builder. Compilation time: 0.299 seconds
-> compiling and loading fused kernels ...
-/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch/utils/cpp_extension.py:283: UserWarning:
-
-                               !! WARNING !!
-
-!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
-Your compiler (c++) is not compatible with the compiler PyTorch was
-built with for this platform, which is g++ on linux. Please
-use g++ to compile your extension. Alternatively, you may
-compile PyTorch from source using c++, and then you can also use
-c++ to compile your extension.
-
-See https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md for help
-with compiling PyTorch from source.
-!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
-
-                               !! WARNING !!
-
-  warnings.warn(WRONG_COMPILER_WARNING.format(
-Detected CUDA files, patching ldflags
-Emitting ninja build file /gpfsssd/worksf/projects/rech/six/commun/code/tr8-104B/Megatron-DeepSpeed-tr8-104B/megatron/fused_kernels/build/build.ninja...
-Building extension module scaled_upper_triang_masked_softmax_cuda...
-Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
-ninja: no work to do.
-Loading extension module scaled_upper_triang_masked_softmax_cuda...
-Detected CUDA files, patching ldflags
-Emitting ninja build file /gpfsssd/worksf/projects/rech/six/commun/code/tr8-104B/Megatron-DeepSpeed-tr8-104B/megatron/fused_kernels/build/build.ninja...
-Building extension module scaled_masked_softmax_cuda...
-Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
-ninja: no work to do.
-Loading extension module scaled_masked_softmax_cuda...
-Detected CUDA files, patching ldflags
-Emitting ninja build file /gpfsssd/worksf/projects/rech/six/commun/code/tr8-104B/Megatron-DeepSpeed-tr8-104B/megatron/fused_kernels/build/build.ninja...
-Building extension module fused_mix_prec_layer_norm_cuda...
-Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
-ninja: no work to do.
-Loading extension module fused_mix_prec_layer_norm_cuda...
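The "Building/Loading extension module ..." blocks are PyTorch's JIT extension builder at work: Megatron compiles its fused softmax and layer-norm CUDA kernels on first use, and because the build directory is already populated, ninja reports "no work to do" and the cached modules load immediately. The mechanism, roughly (file names and flags below are illustrative, not Megatron's exact invocation):

    from torch.utils import cpp_extension

    # JIT-compile and load a fused CUDA kernel. ninja caches object files in
    # the build directory, so a warm cache produces "ninja: no work to do."
    scaled_masked_softmax = cpp_extension.load(
        name="scaled_masked_softmax_cuda",
        sources=["scaled_masked_softmax.cpp", "scaled_masked_softmax_cuda.cu"],
        extra_cuda_cflags=["-O3", "--use_fast_math"],
        verbose=True,
    )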
->>> done with compiling and loading fused kernels. Compilation time: 17.207 seconds
-time to initialize megatron (seconds): 4.980
-[after megatron is initialized] datetime: 2021-09-24 04:01:40
-building GPT model ...
-[2021-09-24 04:01:41,035] [INFO] [utils.py:680:see_memory_usage] Before Building Model
-/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch/cuda/memory.py:373: FutureWarning: torch.cuda.memory_cached has been renamed to torch.cuda.memory_reserved
-  warnings.warn(
-/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch/cuda/memory.py:381: FutureWarning: torch.cuda.max_memory_cached has been renamed to torch.cuda.max_memory_reserved
-  warnings.warn(
-[2021-09-24 04:01:41,037] [INFO] [utils.py:681:see_memory_usage] MA 0.0 GB Max_MA 0.0 GB CA 0.0 GB Max_CA 0 GB
-[2021-09-24 04:01:41,037] [INFO] [utils.py:689:see_memory_usage] CPU Virtual Memory: used = 37.36 GB, percent = 20.0%
-SEED_LAYERS=False BASE_SEED=1234 SEED_FN=None
-Using topology: {ProcessCoord(pipe=0, data=0, model=0): 0, ProcessCoord(pipe=0, data=0, model=1): 1, ProcessCoord(pipe=0, data=0, model=2): 2, ProcessCoord(pipe=0, data=0, model=3): 3, ProcessCoord(pipe=0, data=1, model=0): 4, ProcessCoord(pipe=0, data=1, model=1): 5, ProcessCoord(pipe=0, data=1, model=2): 6, ProcessCoord(pipe=0, data=1, model=3): 7, ProcessCoord(pipe=0, data=2, model=0): 8, ProcessCoord(pipe=0, data=2, model=1): 9, ProcessCoord(pipe=0, data=2, model=2): 10, ProcessCoord(pipe=0, data=2, model=3): 11, ProcessCoord(pipe=0, data=3, model=0): 12, ProcessCoord(pipe=0, data=3, model=1): 13, ProcessCoord(pipe=0, data=3, model=2): 14, ProcessCoord(pipe=0, data=3, model=3): 15, ProcessCoord(pipe=0, data=4, model=0): 16, ProcessCoord(pipe=0, data=4, model=1): 17, ProcessCoord(pipe=0, data=4, model=2): 18, ProcessCoord(pipe=0, data=4, model=3): 19, ProcessCoord(pipe=0, data=5, model=0): 20, ProcessCoord(pipe=0, data=5, model=1): 21, ProcessCoord(pipe=0, data=5, model=2): 22, ProcessCoord(pipe=0, data=5, model=3): 23, ProcessCoord(pipe=0, data=6, model=0): 24, ProcessCoord(pipe=0, data=6, model=1): 25, ProcessCoord(pipe=0, data=6, model=2): 26, ProcessCoord(pipe=0, data=6, model=3): 27, ProcessCoord(pipe=0, data=7, model=0): 28, ProcessCoord(pipe=0, data=7, model=1): 29, ProcessCoord(pipe=0, data=7, model=2): 30, ProcessCoord(pipe=0, data=7, model=3): 31, ProcessCoord(pipe=1, data=0, model=0): 32, ProcessCoord(pipe=1, data=0, model=1): 33, ProcessCoord(pipe=1, data=0, model=2): 34, ProcessCoord(pipe=1, data=0, model=3): 35, ProcessCoord(pipe=1, data=1, model=0): 36, ProcessCoord(pipe=1, data=1, model=1): 37, ProcessCoord(pipe=1, data=1, model=2): 38, ProcessCoord(pipe=1, data=1, model=3): 39, ProcessCoord(pipe=1, data=2, model=0): 40, ProcessCoord(pipe=1, data=2, model=1): 41, ProcessCoord(pipe=1, data=2, model=2): 42, ProcessCoord(pipe=1, data=2, model=3): 43, ProcessCoord(pipe=1, data=3, model=0): 44, ProcessCoord(pipe=1, data=3, model=1): 45, ProcessCoord(pipe=1, data=3, model=2): 46, ProcessCoord(pipe=1, data=3, model=3): 47, ProcessCoord(pipe=1, data=4, model=0): 48, ProcessCoord(pipe=1, data=4, model=1): 49, ProcessCoord(pipe=1, data=4, model=2): 50, ProcessCoord(pipe=1, data=4, model=3): 51, ProcessCoord(pipe=1, data=5, model=0): 52, ProcessCoord(pipe=1, data=5, model=1): 53, ProcessCoord(pipe=1, data=5, model=2): 54, ProcessCoord(pipe=1, data=5, model=3): 55,
ProcessCoord(pipe=1, data=6, model=0): 56, ProcessCoord(pipe=1, data=6, model=1): 57, ProcessCoord(pipe=1, data=6, model=2): 58, ProcessCoord(pipe=1, data=6, model=3): 59, ProcessCoord(pipe=1, data=7, model=0): 60, ProcessCoord(pipe=1, data=7, model=1): 61, ProcessCoord(pipe=1, data=7, model=2): 62, ProcessCoord(pipe=1, data=7, model=3): 63, ProcessCoord(pipe=2, data=0, model=0): 64, ProcessCoord(pipe=2, data=0, model=1): 65, ProcessCoord(pipe=2, data=0, model=2): 66, ProcessCoord(pipe=2, data=0, model=3): 67, ProcessCoord(pipe=2, data=1, model=0): 68, ProcessCoord(pipe=2, data=1, model=1): 69, ProcessCoord(pipe=2, data=1, model=2): 70, ProcessCoord(pipe=2, data=1, model=3): 71, ProcessCoord(pipe=2, data=2, model=0): 72, ProcessCoord(pipe=2, data=2, model=1): 73, ProcessCoord(pipe=2, data=2, model=2): 74, ProcessCoord(pipe=2, data=2, model=3): 75, ProcessCoord(pipe=2, data=3, model=0): 76, ProcessCoord(pipe=2, data=3, model=1): 77, ProcessCoord(pipe=2, data=3, model=2): 78, ProcessCoord(pipe=2, data=3, model=3): 79, ProcessCoord(pipe=2, data=4, model=0): 80, ProcessCoord(pipe=2, data=4, model=1): 81, ProcessCoord(pipe=2, data=4, model=2): 82, ProcessCoord(pipe=2, data=4, model=3): 83, ProcessCoord(pipe=2, data=5, model=0): 84, ProcessCoord(pipe=2, data=5, model=1): 85, ProcessCoord(pipe=2, data=5, model=2): 86, ProcessCoord(pipe=2, data=5, model=3): 87, ProcessCoord(pipe=2, data=6, model=0): 88, ProcessCoord(pipe=2, data=6, model=1): 89, ProcessCoord(pipe=2, data=6, model=2): 90, ProcessCoord(pipe=2, data=6, model=3): 91, ProcessCoord(pipe=2, data=7, model=0): 92, ProcessCoord(pipe=2, data=7, model=1): 93, ProcessCoord(pipe=2, data=7, model=2): 94, ProcessCoord(pipe=2, data=7, model=3): 95, ProcessCoord(pipe=3, data=0, model=0): 96, ProcessCoord(pipe=3, data=0, model=1): 97, ProcessCoord(pipe=3, data=0, model=2): 98, ProcessCoord(pipe=3, data=0, model=3): 99, ProcessCoord(pipe=3, data=1, model=0): 100, ProcessCoord(pipe=3, data=1, model=1): 101, ProcessCoord(pipe=3, data=1, model=2): 102, ProcessCoord(pipe=3, data=1, model=3): 103, ProcessCoord(pipe=3, data=2, model=0): 104, ProcessCoord(pipe=3, data=2, model=1): 105, ProcessCoord(pipe=3, data=2, model=2): 106, ProcessCoord(pipe=3, data=2, model=3): 107, ProcessCoord(pipe=3, data=3, model=0): 108, ProcessCoord(pipe=3, data=3, model=1): 109, ProcessCoord(pipe=3, data=3, model=2): 110, ProcessCoord(pipe=3, data=3, model=3): 111, ProcessCoord(pipe=3, data=4, model=0): 112, ProcessCoord(pipe=3, data=4, model=1): 113, ProcessCoord(pipe=3, data=4, model=2): 114, ProcessCoord(pipe=3, data=4, model=3): 115, ProcessCoord(pipe=3, data=5, model=0): 116, ProcessCoord(pipe=3, data=5, model=1): 117, ProcessCoord(pipe=3, data=5, model=2): 118, ProcessCoord(pipe=3, data=5, model=3): 119, ProcessCoord(pipe=3, data=6, model=0): 120, ProcessCoord(pipe=3, data=6, model=1): 121, ProcessCoord(pipe=3, data=6, model=2): 122, ProcessCoord(pipe=3, data=6, model=3): 123, ProcessCoord(pipe=3, data=7, model=0): 124, ProcessCoord(pipe=3, data=7, model=1): 125, ProcessCoord(pipe=3, data=7, model=2): 126, ProcessCoord(pipe=3, data=7, model=3): 127, ProcessCoord(pipe=4, data=0, model=0): 128, ProcessCoord(pipe=4, data=0, model=1): 129, ProcessCoord(pipe=4, data=0, model=2): 130, ProcessCoord(pipe=4, data=0, model=3): 131, ProcessCoord(pipe=4, data=1, model=0): 132, ProcessCoord(pipe=4, data=1, model=1): 133, ProcessCoord(pipe=4, data=1, model=2): 134, ProcessCoord(pipe=4, data=1, model=3): 135, ProcessCoord(pipe=4, data=2, model=0): 136, ProcessCoord(pipe=4, data=2, 
model=1): 137, ProcessCoord(pipe=4, data=2, model=2): 138, ProcessCoord(pipe=4, data=2, model=3): 139, ProcessCoord(pipe=4, data=3, model=0): 140, ProcessCoord(pipe=4, data=3, model=1): 141, ProcessCoord(pipe=4, data=3, model=2): 142, ProcessCoord(pipe=4, data=3, model=3): 143, ProcessCoord(pipe=4, data=4, model=0): 144, ProcessCoord(pipe=4, data=4, model=1): 145, ProcessCoord(pipe=4, data=4, model=2): 146, ProcessCoord(pipe=4, data=4, model=3): 147, ProcessCoord(pipe=4, data=5, model=0): 148, ProcessCoord(pipe=4, data=5, model=1): 149, ProcessCoord(pipe=4, data=5, model=2): 150, ProcessCoord(pipe=4, data=5, model=3): 151, ProcessCoord(pipe=4, data=6, model=0): 152, ProcessCoord(pipe=4, data=6, model=1): 153, ProcessCoord(pipe=4, data=6, model=2): 154, ProcessCoord(pipe=4, data=6, model=3): 155, ProcessCoord(pipe=4, data=7, model=0): 156, ProcessCoord(pipe=4, data=7, model=1): 157, ProcessCoord(pipe=4, data=7, model=2): 158, ProcessCoord(pipe=4, data=7, model=3): 159, ProcessCoord(pipe=5, data=0, model=0): 160, ProcessCoord(pipe=5, data=0, model=1): 161, ProcessCoord(pipe=5, data=0, model=2): 162, ProcessCoord(pipe=5, data=0, model=3): 163, ProcessCoord(pipe=5, data=1, model=0): 164, ProcessCoord(pipe=5, data=1, model=1): 165, ProcessCoord(pipe=5, data=1, model=2): 166, ProcessCoord(pipe=5, data=1, model=3): 167, ProcessCoord(pipe=5, data=2, model=0): 168, ProcessCoord(pipe=5, data=2, model=1): 169, ProcessCoord(pipe=5, data=2, model=2): 170, ProcessCoord(pipe=5, data=2, model=3): 171, ProcessCoord(pipe=5, data=3, model=0): 172, ProcessCoord(pipe=5, data=3, model=1): 173, ProcessCoord(pipe=5, data=3, model=2): 174, ProcessCoord(pipe=5, data=3, model=3): 175, ProcessCoord(pipe=5, data=4, model=0): 176, ProcessCoord(pipe=5, data=4, model=1): 177, ProcessCoord(pipe=5, data=4, model=2): 178, ProcessCoord(pipe=5, data=4, model=3): 179, ProcessCoord(pipe=5, data=5, model=0): 180, ProcessCoord(pipe=5, data=5, model=1): 181, ProcessCoord(pipe=5, data=5, model=2): 182, ProcessCoord(pipe=5, data=5, model=3): 183, ProcessCoord(pipe=5, data=6, model=0): 184, ProcessCoord(pipe=5, data=6, model=1): 185, ProcessCoord(pipe=5, data=6, model=2): 186, ProcessCoord(pipe=5, data=6, model=3): 187, ProcessCoord(pipe=5, data=7, model=0): 188, ProcessCoord(pipe=5, data=7, model=1): 189, ProcessCoord(pipe=5, data=7, model=2): 190, ProcessCoord(pipe=5, data=7, model=3): 191, ProcessCoord(pipe=6, data=0, model=0): 192, ProcessCoord(pipe=6, data=0, model=1): 193, ProcessCoord(pipe=6, data=0, model=2): 194, ProcessCoord(pipe=6, data=0, model=3): 195, ProcessCoord(pipe=6, data=1, model=0): 196, ProcessCoord(pipe=6, data=1, model=1): 197, ProcessCoord(pipe=6, data=1, model=2): 198, ProcessCoord(pipe=6, data=1, model=3): 199, ProcessCoord(pipe=6, data=2, model=0): 200, ProcessCoord(pipe=6, data=2, model=1): 201, ProcessCoord(pipe=6, data=2, model=2): 202, ProcessCoord(pipe=6, data=2, model=3): 203, ProcessCoord(pipe=6, data=3, model=0): 204, ProcessCoord(pipe=6, data=3, model=1): 205, ProcessCoord(pipe=6, data=3, model=2): 206, ProcessCoord(pipe=6, data=3, model=3): 207, ProcessCoord(pipe=6, data=4, model=0): 208, ProcessCoord(pipe=6, data=4, model=1): 209, ProcessCoord(pipe=6, data=4, model=2): 210, ProcessCoord(pipe=6, data=4, model=3): 211, ProcessCoord(pipe=6, data=5, model=0): 212, ProcessCoord(pipe=6, data=5, model=1): 213, ProcessCoord(pipe=6, data=5, model=2): 214, ProcessCoord(pipe=6, data=5, model=3): 215, ProcessCoord(pipe=6, data=6, model=0): 216, ProcessCoord(pipe=6, data=6, model=1): 217, 
ProcessCoord(pipe=6, data=6, model=2): 218, ProcessCoord(pipe=6, data=6, model=3): 219, ProcessCoord(pipe=6, data=7, model=0): 220, ProcessCoord(pipe=6, data=7, model=1): 221, ProcessCoord(pipe=6, data=7, model=2): 222, ProcessCoord(pipe=6, data=7, model=3): 223, ProcessCoord(pipe=7, data=0, model=0): 224, ProcessCoord(pipe=7, data=0, model=1): 225, ProcessCoord(pipe=7, data=0, model=2): 226, ProcessCoord(pipe=7, data=0, model=3): 227, ProcessCoord(pipe=7, data=1, model=0): 228, ProcessCoord(pipe=7, data=1, model=1): 229, ProcessCoord(pipe=7, data=1, model=2): 230, ProcessCoord(pipe=7, data=1, model=3): 231, ProcessCoord(pipe=7, data=2, model=0): 232, ProcessCoord(pipe=7, data=2, model=1): 233, ProcessCoord(pipe=7, data=2, model=2): 234, ProcessCoord(pipe=7, data=2, model=3): 235, ProcessCoord(pipe=7, data=3, model=0): 236, ProcessCoord(pipe=7, data=3, model=1): 237, ProcessCoord(pipe=7, data=3, model=2): 238, ProcessCoord(pipe=7, data=3, model=3): 239, ProcessCoord(pipe=7, data=4, model=0): 240, ProcessCoord(pipe=7, data=4, model=1): 241, ProcessCoord(pipe=7, data=4, model=2): 242, ProcessCoord(pipe=7, data=4, model=3): 243, ProcessCoord(pipe=7, data=5, model=0): 244, ProcessCoord(pipe=7, data=5, model=1): 245, ProcessCoord(pipe=7, data=5, model=2): 246, ProcessCoord(pipe=7, data=5, model=3): 247, ProcessCoord(pipe=7, data=6, model=0): 248, ProcessCoord(pipe=7, data=6, model=1): 249, ProcessCoord(pipe=7, data=6, model=2): 250, ProcessCoord(pipe=7, data=6, model=3): 251, ProcessCoord(pipe=7, data=7, model=0): 252, ProcessCoord(pipe=7, data=7, model=1): 253, ProcessCoord(pipe=7, data=7, model=2): 254, ProcessCoord(pipe=7, data=7, model=3): 255}
-[2021-09-24 04:01:42,442] [INFO] [module.py:360:_partition_layers] Partitioning pipeline stages with method type:transformer
-stage=0 layers=7
-     0: _to_float16
-     1: EmbeddingPipe
-     2: <lambda>
-     3: ParallelTransformerLayerPipe
-     4: ParallelTransformerLayerPipe
-     5: ParallelTransformerLayerPipe
-     6: ParallelTransformerLayerPipe
-stage=1 layers=4
-     7: ParallelTransformerLayerPipe
-     8: ParallelTransformerLayerPipe
-     9: ParallelTransformerLayerPipe
-    10: ParallelTransformerLayerPipe
-stage=2 layers=4
-    11: ParallelTransformerLayerPipe
-    12: ParallelTransformerLayerPipe
-    13: ParallelTransformerLayerPipe
-    14: ParallelTransformerLayerPipe
-stage=3 layers=4
-    15: ParallelTransformerLayerPipe
-    16: ParallelTransformerLayerPipe
-    17: ParallelTransformerLayerPipe
-    18: ParallelTransformerLayerPipe
-stage=4 layers=4
-    19: ParallelTransformerLayerPipe
-    20: ParallelTransformerLayerPipe
-    21: ParallelTransformerLayerPipe
-    22: ParallelTransformerLayerPipe
-stage=5 layers=4
-    23: ParallelTransformerLayerPipe
-    24: ParallelTransformerLayerPipe
-    25: ParallelTransformerLayerPipe
-    26: ParallelTransformerLayerPipe
-stage=6 layers=4
-    27: ParallelTransformerLayerPipe
-    28: ParallelTransformerLayerPipe
-    29: ParallelTransformerLayerPipe
-    30: ParallelTransformerLayerPipe
-stage=7 layers=8
-    31: ParallelTransformerLayerPipe
-    32: ParallelTransformerLayerPipe
-    33: ParallelTransformerLayerPipe
-    34: ParallelTransformerLayerPipe
-    35: <lambda>
-    36: MixedFusedLayerNorm
-    37: EmbeddingPipe
-    38: float16_to_fp32
-  loss: CrossEntropy
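The topology above lays the 256 ranks out on an 8 x 8 x 4 (pipe, data, model) grid with the tensor-parallel ("model") axis varying fastest, and the stage listing then splits the 32 transformer layers evenly across the 8 pipeline stages (4 each), with the fp16 cast, embedding, and final layer norm attached to the first and last stages. The rank numbering can be reproduced with (an indexing sketch, not DeepSpeed's code):

    PIPE, DATA, MODEL = 8, 8, 4    # 8 * 8 * 4 = 256 ranks

    def coord_to_rank(pipe, data, model):
        # model varies fastest, then data, then pipe -- matching the table above
        return (pipe * DATA + data) * MODEL + model

    def rank_to_coord(rank):
        return rank // (DATA * MODEL), (rank // MODEL) % DATA, rank % MODEL

    assert coord_to_rank(1, 5, 3) == 55
    assert rank_to_coord(255) == (7, 7, 3)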
- > number of parameters on (tensor, pipeline) model parallel rank (1, 5): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (0, 2): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (3, 2): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (0, 1): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (2, 1): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (0, 4): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (2, 4): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (3, 4): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (1, 4): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (2, 2): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (0, 6): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (2, 6): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (3, 6): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (1, 6): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (0, 3): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (1, 3): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (2, 3): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (3, 3): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (3, 1): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (1, 1): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (0, 5): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (1, 2): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (3, 5): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (2, 5): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (1, 7): 1986498560
- > number of parameters on (tensor, pipeline) model parallel rank (2, 7): 1986498560
- > number of parameters on (tensor, pipeline) model parallel rank (0, 7): 1986498560
- > number of parameters on (tensor, pipeline) model parallel rank (3, 7): 1986498560
- > number of parameters on (tensor, pipeline) model parallel rank (2, 0): 1986465792
- > number of parameters on (tensor, pipeline) model parallel rank (1, 0): 1986465792
- > number of parameters on (tensor, pipeline) model parallel rank (3, 0): 1986465792
-[2021-09-24 04:01:43,676] [INFO] [utils.py:680:see_memory_usage] After Building Model
-[2021-09-24 04:01:43,677] [INFO] [utils.py:681:see_memory_usage] MA 3.77 GB Max_MA 3.79 GB CA 3.79 GB Max_CA 4 GB
-[2021-09-24 04:01:43,677] [INFO] [utils.py:689:see_memory_usage] CPU Virtual Memory: used = 37.55 GB, percent = 20.1%
- > number of parameters on (tensor, pipeline) model parallel rank (0, 0): 1986465792
-setting training iterations to 159576
-> learning rate decay style: cosine
-DeepSpeed is enabled.
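"setting training iterations to 159576" is derived from `train_samples 300000000` and the batch-size rampup: one iteration is counted per (growing) global batch during the rampup, then the remaining samples are divided by the final batch size of 2048. A sketch of that accounting, reusing the rampup helper sketched earlier (the exact count depends on Megatron's rounding at each increment, so this should land near, not necessarily exactly on, 159576):

    def train_iters(train_samples=300_000_000, target=2048,
                    rampup_samples=6_000_000):
        iters, consumed = 0, 0
        while consumed <= rampup_samples:               # rampup phase
            consumed += rampup_batch_size(consumed)     # helper sketched earlier
            iters += 1
        iters += (train_samples - consumed) // target   # constant phase; any
        return iters                                    # partial batch is dropped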
-[2021-09-24 04:01:43,733] [INFO] [logging.py:68:log_dist] [Rank 0] DeepSpeed info: version=0.4.2+bc17042, git-hash=bc17042, git-branch=big-science
-[2021-09-24 04:01:43,813] [INFO] [engine.py:179:__init__] DeepSpeed Flops Profiler Enabled: False
-[2021-09-24 04:01:43,813] [INFO] [engine.py:736:_configure_optimizer] Removing param_group that has no 'params' in the client Optimizer
-[2021-09-24 04:01:43,813] [INFO] [engine.py:741:_configure_optimizer] Using client Optimizer as basic optimizer
-[2021-09-24 04:01:43,813] [INFO] [engine.py:750:_configure_optimizer] DeepSpeed Basic Optimizer = FusedAdam
-[2021-09-24 04:01:43,813] [INFO] [utils.py:43:is_zero_supported_optimizer] Checking ZeRO support for optimizer=FusedAdam type=
-[2021-09-24 04:01:43,813] [INFO] [logging.py:68:log_dist] [Rank 0] Creating fp16 ZeRO stage 1 optimizer
-[2021-09-24 04:01:43,814] [INFO] [stage2.py:106:__init__] Reduce bucket size 500000000
-[2021-09-24 04:01:43,814] [INFO] [stage2.py:107:__init__] Allgather bucket size 500000000
-[2021-09-24 04:01:43,814] [INFO] [stage2.py:108:__init__] CPU Offload: False
-[2021-09-24 04:01:43,814] [INFO] [stage2.py:109:__init__] Round robin gradient partitioning: False
-[2021-09-24 04:01:48,526] [INFO] [stage2.py:419:__init__] optimizer state initialized
-[2021-09-24 04:01:48,527] [INFO] [logging.py:68:log_dist] [Rank 0] DeepSpeed Final Optimizer = FusedAdam
-[2021-09-24 04:01:48,527] [INFO] [engine.py:553:_configure_lr_scheduler] DeepSpeed using client LR scheduler
-[2021-09-24 04:01:48,527] [INFO] [logging.py:68:log_dist] [Rank 0] DeepSpeed LR Scheduler =
-[2021-09-24 04:01:48,527] [INFO] [logging.py:68:log_dist] [Rank 0] step=0, skipped=0, lr=[0.0, 0.0], mom=[(0.9, 0.999), (0.9, 0.999)]
-[2021-09-24 04:01:48,527] [INFO] [config.py:900:print] DeepSpeedEngine configuration:
-[2021-09-24 04:01:48,527] [INFO] [config.py:904:print]   activation_checkpointing_config  {
-    "partition_activations": false,
-    "contiguous_memory_optimization": false,
-    "cpu_checkpointing": false,
-    "number_checkpoints": null,
-    "synchronize_checkpoint_boundary": false,
-    "profile": false
-}
-[2021-09-24 04:01:48,527] [INFO] [config.py:904:print]   aio_config ................... {'block_size': 1048576, 'queue_depth': 8, 'thread_count': 1, 'single_submit': False, 'overlap_events': True}
-[2021-09-24 04:01:48,527] [INFO] [config.py:904:print]   allreduce_always_fp32 ........ False
-[2021-09-24 04:01:48,527] [INFO] [config.py:904:print]   amp_enabled .................. False
-[2021-09-24 04:01:48,527] [INFO] [config.py:904:print]   amp_params ................... False
-[2021-09-24 04:01:48,527] [INFO] [config.py:904:print]   checkpoint_tag_validation_enabled  True
-[2021-09-24 04:01:48,527] [INFO] [config.py:904:print]   checkpoint_tag_validation_fail  False
-[2021-09-24 04:01:48,527] [INFO] [config.py:904:print]   disable_allgather ............ False
-[2021-09-24 04:01:48,527] [INFO] [config.py:904:print]   dump_state ................... False
-[2021-09-24 04:01:48,527] [INFO] [config.py:904:print]   dynamic_loss_scale_args ...... {'init_scale': 4096, 'scale_window': 500, 'delayed_shift': 2, 'min_scale': 1}
-[2021-09-24 04:01:48,527] [INFO] [config.py:904:print]   eigenvalue_enabled ........... False
-[2021-09-24 04:01:48,528] [INFO] [config.py:904:print]   eigenvalue_gas_boundary_resolution  1
-[2021-09-24 04:01:48,528] [INFO] [config.py:904:print]   eigenvalue_layer_name ........ bert.encoder.layer
-[2021-09-24 04:01:48,528] [INFO] [config.py:904:print]   eigenvalue_layer_num ......... 0
-[2021-09-24 04:01:48,528] [INFO] [config.py:904:print]   eigenvalue_max_iter .......... 100
-[2021-09-24 04:01:48,528] [INFO] [config.py:904:print]   eigenvalue_stability ......... 1e-06
-[2021-09-24 04:01:48,528] [INFO] [config.py:904:print]   eigenvalue_tol ............... 0.01
-[2021-09-24 04:01:48,528] [INFO] [config.py:904:print]   eigenvalue_verbose ........... False
-[2021-09-24 04:01:48,528] [INFO] [config.py:904:print]   elasticity_enabled ........... False
-[2021-09-24 04:01:48,528] [INFO] [config.py:904:print]   flops_profiler_config ........ {
-    "enabled": false,
-    "profile_step": 1,
-    "module_depth": -1,
-    "top_modules": 1,
-    "detailed": true,
-    "output_file": null
-}
-[2021-09-24 04:01:48,528] [INFO] [config.py:904:print]   fp16_enabled ................. True
-[2021-09-24 04:01:48,528] [INFO] [config.py:904:print]   fp16_mixed_quantize .......... False
-[2021-09-24 04:01:48,528] [INFO] [config.py:904:print]   global_rank .................. 0
-[2021-09-24 04:01:48,528] [INFO] [config.py:904:print]   gradient_accumulation_steps .. 256
-[2021-09-24 04:01:48,528] [INFO] [config.py:904:print]   gradient_clipping ............ 1.0
-[2021-09-24 04:01:48,528] [INFO] [config.py:904:print]   gradient_predivide_factor .... 1.0
-[2021-09-24 04:01:48,528] [INFO] [config.py:904:print]   initial_dynamic_scale ........ 4096
-[2021-09-24 04:01:48,528] [INFO] [config.py:904:print]   loss_scale ................... 0
-[2021-09-24 04:01:48,528] [INFO] [config.py:904:print]   memory_breakdown ............. False
-[2021-09-24 04:01:48,528] [INFO] [config.py:904:print]   optimizer_legacy_fusion ...... False
-[2021-09-24 04:01:48,528] [INFO] [config.py:904:print]   optimizer_name ............... None
-[2021-09-24 04:01:48,528] [INFO] [config.py:904:print]   optimizer_params ............. None
-[2021-09-24 04:01:48,528] [INFO] [config.py:904:print]   pipeline ..................... {'stages': 'auto', 'partition': 'best', 'seed_layers': False, 'activation_checkpoint_interval': 0}
-[2021-09-24 04:01:48,528] [INFO] [config.py:904:print]   pld_enabled .................. False
-[2021-09-24 04:01:48,528] [INFO] [config.py:904:print]   pld_params ................... False
-[2021-09-24 04:01:48,528] [INFO] [config.py:904:print]   prescale_gradients ........... False
-[2021-09-24 04:01:48,528] [INFO] [config.py:904:print]   quantize_change_rate ......... 0.001
-[2021-09-24 04:01:48,528] [INFO] [config.py:904:print]   quantize_groups .............. 1
-[2021-09-24 04:01:48,528] [INFO] [config.py:904:print]   quantize_offset .............. 1000
-[2021-09-24 04:01:48,528] [INFO] [config.py:904:print]   quantize_period .............. 1000
-[2021-09-24 04:01:48,528] [INFO] [config.py:904:print]   quantize_rounding ............ 0
-[2021-09-24 04:01:48,528] [INFO] [config.py:904:print]   quantize_start_bits .......... 16
-[2021-09-24 04:01:48,528] [INFO] [config.py:904:print]   quantize_target_bits ......... 8
-[2021-09-24 04:01:48,528] [INFO] [config.py:904:print]   quantize_training_enabled .... False
-[2021-09-24 04:01:48,528] [INFO] [config.py:904:print]   quantize_type ................ 0
-[2021-09-24 04:01:48,528] [INFO] [config.py:904:print]   quantize_verbose ............. False
-[2021-09-24 04:01:48,528] [INFO] [config.py:904:print]   scheduler_name ............... None
-[2021-09-24 04:01:48,528] [INFO] [config.py:904:print]   scheduler_params ............. None
-[2021-09-24 04:01:48,529] [INFO] [config.py:904:print]   sparse_attention ............. None
-[2021-09-24 04:01:48,529] [INFO] [config.py:904:print]   sparse_gradients_enabled ..... False
-[2021-09-24 04:01:48,529] [INFO] [config.py:904:print]   steps_per_print .............. 2000
-[2021-09-24 04:01:48,529] [INFO] [config.py:904:print]   tensorboard_enabled .......... False
-[2021-09-24 04:01:48,529] [INFO] [config.py:904:print]   tensorboard_job_name ......... DeepSpeedJobName
-[2021-09-24 04:01:48,529] [INFO] [config.py:904:print]   tensorboard_output_path ......
-[2021-09-24 04:01:48,529] [INFO] [config.py:904:print]   train_batch_size ............. 2048
-[2021-09-24 04:01:48,529] [INFO] [config.py:904:print]   train_micro_batch_size_per_gpu  1
-[2021-09-24 04:01:48,529] [INFO] [config.py:904:print]   use_quantizer_kernel ......... False
-[2021-09-24 04:01:48,529] [INFO] [config.py:904:print]   wall_clock_breakdown ......... False
-[2021-09-24 04:01:48,529] [INFO] [config.py:904:print]   world_size ................... 8
-[2021-09-24 04:01:48,529] [INFO] [config.py:904:print]   zero_allow_untested_optimizer  False
-[2021-09-24 04:01:48,529] [INFO] [config.py:904:print]   zero_config .................. {
-    "stage": 1,
-    "contiguous_gradients": false,
-    "reduce_scatter": true,
-    "reduce_bucket_size": 5.000000e+08,
-    "allgather_partitions": true,
-    "allgather_bucket_size": 5.000000e+08,
-    "overlap_comm": false,
-    "load_from_fp32_weights": true,
-    "elastic_checkpoint": true,
-    "offload_param": null,
-    "offload_optimizer": null,
-    "sub_group_size": 1.000000e+09,
-    "prefetch_bucket_size": 5.000000e+07,
-    "param_persistence_threshold": 1.000000e+05,
-    "max_live_parameters": 1.000000e+09,
-    "max_reuse_distance": 1.000000e+09,
-    "gather_fp16_weights_on_model_save": false,
-    "ignore_unused_parameters": true,
-    "round_robin_gradients": false,
-    "legacy_stage1": false
-}
-[2021-09-24 04:01:48,529] [INFO] [config.py:904:print]   zero_enabled ................. True
-[2021-09-24 04:01:48,529] [INFO] [config.py:904:print]   zero_optimization_stage ...... 1
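With ZeRO stage 1 (`zero_optimization_stage 1`), the fp16 weights and gradients stay replicated within each data-parallel group, while the fp32 master weights and Adam moments (16 bytes per parameter) are sharded across the 8 data-parallel replicas. A back-of-the-envelope for one of the ~1.99B-parameter first-stage shards (rough accounting only; gradient buckets and activations are ignored):

    params = 1_986_465_792        # stage-0 tensor-parallel shard, from the log
    dp = 8                        # data-parallel replicas sharing ZeRO-1 states

    fp16_weights = 2 * params             # ~3.7 GiB, replicated
    fp16_grads   = 2 * params             # ~3.7 GiB, replicated
    optim_states = 16 * params / dp       # fp32 master + Adam m and v, sharded
    print(f"{(fp16_weights + fp16_grads + optim_states) / 2**30:.1f} GiB")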
1 -[2021-09-24 04:01:48,529] [INFO] [config.py:906:print] json = { - "train_micro_batch_size_per_gpu": 1, - "train_batch_size": 2.048000e+03, - "gradient_clipping": 1.0, - "zero_optimization": { - "stage": 1 - }, - "fp16": { - "enabled": true, - "loss_scale": 0, - "loss_scale_window": 500, - "hysteresis": 2, - "min_loss_scale": 1, - "initial_scale_power": 12 - }, - "steps_per_print": 2.000000e+03, - "wall_clock_breakdown": false -} -[2021-09-24 04:01:48,529] [INFO] [engine.py:76:__init__] CONFIG: micro_batches=256 micro_batch_size=1 -[2021-09-24 04:01:48,959] [INFO] [engine.py:134:__init__] RANK=0 STAGE=0 LAYERS=7 [0, 7) STAGE_PARAMS=1986465792 (1986.466M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 04:01:48,959] [INFO] [engine.py:134:__init__] RANK=2 STAGE=0 LAYERS=7 [0, 7) STAGE_PARAMS=1986465792 (1986.466M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 04:01:48,959] [INFO] [engine.py:134:__init__] RANK=3 STAGE=0 LAYERS=7 [0, 7) STAGE_PARAMS=1986465792 (1986.466M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 04:01:48,959] [INFO] [engine.py:134:__init__] RANK=1 STAGE=0 LAYERS=7 [0, 7) STAGE_PARAMS=1986465792 (1986.466M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 04:01:48,959] [INFO] [engine.py:134:__init__] RANK=67 STAGE=2 LAYERS=4 [11, 15) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 04:01:48,959] [INFO] [engine.py:134:__init__] RANK=64 STAGE=2 LAYERS=4 [11, 15) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 04:01:48,959] [INFO] [engine.py:134:__init__] RANK=66 STAGE=2 LAYERS=4 [11, 15) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 04:01:48,959] [INFO] [engine.py:134:__init__] RANK=130 STAGE=4 LAYERS=4 [19, 23) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 04:01:48,959] [INFO] [engine.py:134:__init__] RANK=129 STAGE=4 LAYERS=4 [19, 23) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 04:01:48,959] [INFO] [engine.py:134:__init__] RANK=131 STAGE=4 LAYERS=4 [19, 23) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 04:01:48,959] [INFO] [engine.py:134:__init__] RANK=128 STAGE=4 LAYERS=4 [19, 23) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 04:01:48,959] [INFO] [engine.py:134:__init__] RANK=193 STAGE=6 LAYERS=4 [27, 31) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 04:01:48,959] [INFO] [engine.py:134:__init__] RANK=194 STAGE=6 LAYERS=4 [27, 31) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 04:01:48,959] [INFO] [engine.py:134:__init__] RANK=195 STAGE=6 LAYERS=4 [27, 31) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 04:01:48,959] [INFO] [engine.py:134:__init__] RANK=65 STAGE=2 LAYERS=4 [11, 15) STAGE_PARAMS=1745293312 (1745.293M) 
TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 04:01:48,959] [INFO] [engine.py:134:__init__] RANK=226 STAGE=7 LAYERS=8 [31, 39) STAGE_PARAMS=1986498560 (1986.499M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 04:01:48,959] [INFO] [engine.py:134:__init__] RANK=225 STAGE=7 LAYERS=8 [31, 39) STAGE_PARAMS=1986498560 (1986.499M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 04:01:48,959] [INFO] [engine.py:134:__init__] RANK=227 STAGE=7 LAYERS=8 [31, 39) STAGE_PARAMS=1986498560 (1986.499M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 04:01:48,959] [INFO] [engine.py:134:__init__] RANK=224 STAGE=7 LAYERS=8 [31, 39) STAGE_PARAMS=1986498560 (1986.499M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 04:01:48,959] [INFO] [engine.py:134:__init__] RANK=99 STAGE=3 LAYERS=4 [15, 19) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 04:01:48,959] [INFO] [engine.py:134:__init__] RANK=96 STAGE=3 LAYERS=4 [15, 19) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 04:01:48,959] [INFO] [engine.py:134:__init__] RANK=97 STAGE=3 LAYERS=4 [15, 19) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 04:01:48,959] [INFO] [engine.py:134:__init__] RANK=35 STAGE=1 LAYERS=4 [7, 11) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 04:01:48,959] [INFO] [engine.py:134:__init__] RANK=33 STAGE=1 LAYERS=4 [7, 11) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 04:01:48,959] [INFO] [engine.py:134:__init__] RANK=32 STAGE=1 LAYERS=4 [7, 11) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 04:01:48,959] [INFO] [engine.py:134:__init__] RANK=34 STAGE=1 LAYERS=4 [7, 11) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 04:01:48,959] [INFO] [engine.py:134:__init__] RANK=163 STAGE=5 LAYERS=4 [23, 27) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 04:01:48,959] [INFO] [engine.py:134:__init__] RANK=161 STAGE=5 LAYERS=4 [23, 27) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 04:01:48,959] [INFO] [engine.py:134:__init__] RANK=160 STAGE=5 LAYERS=4 [23, 27) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 04:01:48,959] [INFO] [engine.py:134:__init__] RANK=162 STAGE=5 LAYERS=4 [23, 27) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 04:01:48,959] [INFO] [engine.py:134:__init__] RANK=192 STAGE=6 LAYERS=4 [27, 31) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 04:01:48,959] [INFO] [engine.py:134:__init__] RANK=98 STAGE=3 LAYERS=4 [15, 19) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) 
UNIQUE_PARAMS=56814206976 (56814.207M) - > using checkpoint value 6e-05 for learning rate - > using checkpoint value 6e-06 for minimum learning rate - > using checkpoint value 216320 for warmup iterations - > using checkpoint value 126953125 for total number of iterations - > using checkpoint value cosine for decay style -successfully loaded 8 ZeRO state_dicts for rank 124 -successfully loaded 8 ZeRO state_dicts for rank 115 -successfully loaded 8 ZeRO state_dicts for rank 60 -successfully loaded 8 ZeRO state_dicts for rank 48 -successfully loaded 8 ZeRO state_dicts for rank 61 -successfully loaded 8 ZeRO state_dicts for rank 125 -successfully loaded 8 ZeRO state_dicts for rank 126 -successfully loaded 8 ZeRO state_dicts for rank 127 -successfully loaded 8 ZeRO state_dicts for rank 160 -successfully loaded 8 ZeRO state_dicts for rank 135 -successfully loaded 8 ZeRO state_dicts for rank 68 -successfully loaded 8 ZeRO state_dicts for rank 113 -successfully loaded 8 ZeRO state_dicts for rank 108 -successfully loaded 8 ZeRO state_dicts for rank 27 -successfully loaded 8 ZeRO state_dicts for rank 72 -successfully loaded 8 ZeRO state_dicts for rank 49 -successfully loaded 8 ZeRO state_dicts for rank 71 -successfully loaded 8 ZeRO state_dicts for rank 147 -successfully loaded 8 ZeRO state_dicts for rank 96 -successfully loaded 8 ZeRO state_dicts for rank 32 -successfully loaded 8 ZeRO state_dicts for rank 214 -successfully loaded 8 ZeRO state_dicts for rank 143 -successfully loaded 8 ZeRO state_dicts for rank 158 -successfully loaded 8 ZeRO state_dicts for rank 132 -successfully loaded 8 ZeRO state_dicts for rank 111 -successfully loaded 8 ZeRO state_dicts for rank 155 -successfully loaded 8 ZeRO state_dicts for rank 112 -successfully loaded 8 ZeRO state_dicts for rank 76 -successfully loaded 8 ZeRO state_dicts for rank 63 -successfully loaded 8 ZeRO state_dicts for rank 44 -successfully loaded 8 ZeRO state_dicts for rank 201 -successfully loaded 8 ZeRO state_dicts for rank 213 -successfully loaded 8 ZeRO state_dicts for rank 162 -successfully loaded 8 ZeRO state_dicts for rank 97 -successfully loaded 8 ZeRO state_dicts for rank 51 -successfully loaded 8 ZeRO state_dicts for rank 133 -loading 8 zero partition checkpoints for rank 124 -successfully loaded 8 ZeRO state_dicts for rank 114 -successfully loaded 8 ZeRO state_dicts for rank 33 -successfully loaded 8 ZeRO state_dicts for rank 140 -successfully loaded 8 ZeRO state_dicts for rank 181 -successfully loaded 8 ZeRO state_dicts for rank 41 -successfully loaded 8 ZeRO state_dicts for rank 185 -successfully loaded 8 ZeRO state_dicts for rank 241 -successfully loaded 8 ZeRO state_dicts for rank 134 -successfully loaded 8 ZeRO state_dicts for rank 39 -successfully loaded 8 ZeRO state_dicts for rank 24 -successfully loaded 8 ZeRO state_dicts for rank 212 -successfully loaded 8 ZeRO state_dicts for rank 104 -successfully loaded 8 ZeRO state_dicts for rank 142 -successfully loaded 8 ZeRO state_dicts for rank 154 -successfully loaded 8 ZeRO state_dicts for rank 159 -successfully loaded 8 ZeRO state_dicts for rank 166 -successfully loaded 8 ZeRO state_dicts for rank 148 -successfully loaded 8 ZeRO state_dicts for rank 35 -successfully loaded 8 ZeRO state_dicts for rank 70 -successfully loaded 8 ZeRO state_dicts for rank 75 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-24 04:02:16 CEST)" was missed by 0:00:03.600668 -successfully loaded 8 ZeRO state_dicts for 
rank 156 -successfully loaded 8 ZeRO state_dicts for rank 161 -successfully loaded 8 ZeRO state_dicts for rank 243 -successfully loaded 8 ZeRO state_dicts for rank 40 -successfully loaded 8 ZeRO state_dicts for rank 141 -successfully loaded 8 ZeRO state_dicts for rank 98 -successfully loaded 8 ZeRO state_dicts for rank 210 -successfully loaded 8 ZeRO state_dicts for rank 52 -successfully loaded 8 ZeRO state_dicts for rank 28 -successfully loaded 8 ZeRO state_dicts for rank 110 -successfully loaded 8 ZeRO state_dicts for rank 139 -successfully loaded 8 ZeRO state_dicts for rank 36 -successfully loaded 8 ZeRO state_dicts for rank 168 -successfully loaded 8 ZeRO state_dicts for rank 26 -successfully loaded 8 ZeRO state_dicts for rank 84 -successfully loaded 8 ZeRO state_dicts for rank 208 -successfully loaded 8 ZeRO state_dicts for rank 190 -successfully loaded 8 ZeRO state_dicts for rank 92 -loading 8 zero partition checkpoints for rank 115 -successfully loaded 8 ZeRO state_dicts for rank 34 -successfully loaded 8 ZeRO state_dicts for rank 171 -successfully loaded 8 ZeRO state_dicts for rank 152 -successfully loaded 8 ZeRO state_dicts for rank 73 -successfully loaded 8 ZeRO state_dicts for rank 47 -successfully loaded 8 ZeRO state_dicts for rank 62 -successfully loaded 8 ZeRO state_dicts for rank 150 -successfully loaded 8 ZeRO state_dicts for rank 69 -successfully loaded 8 ZeRO state_dicts for rank 157 -successfully loaded 8 ZeRO state_dicts for rank 182 -successfully loaded 8 ZeRO state_dicts for rank 145 -successfully loaded 8 ZeRO state_dicts for rank 79 -successfully loaded 8 ZeRO state_dicts for rank 88 -successfully loaded 8 ZeRO state_dicts for rank 109 -successfully loaded 8 ZeRO state_dicts for rank 56 -successfully loaded 8 ZeRO state_dicts for rank 149 -successfully loaded 8 ZeRO state_dicts for rank 50 -successfully loaded 8 ZeRO state_dicts for rank 42 -successfully loaded 8 ZeRO state_dicts for rank 206 -successfully loaded 8 ZeRO state_dicts for rank 196 -successfully loaded 8 ZeRO state_dicts for rank 80 -successfully loaded 8 ZeRO state_dicts for rank 215 -successfully loaded 8 ZeRO state_dicts for rank 74 -successfully loaded 8 ZeRO state_dicts for rank 43 -successfully loaded 8 ZeRO state_dicts for rank 99 -successfully loaded 8 ZeRO state_dicts for rank 192 -successfully loaded 8 ZeRO state_dicts for rank 78 -successfully loaded 8 ZeRO state_dicts for rank 37 -successfully loaded 8 ZeRO state_dicts for rank 216 -successfully loaded 8 ZeRO state_dicts for rank 153 -successfully loaded 8 ZeRO state_dicts for rank 77 -loading 8 zero partition checkpoints for rank 126 -loading 8 zero partition checkpoints for rank 125 -successfully loaded 8 ZeRO state_dicts for rank 193 -successfully loaded 8 ZeRO state_dicts for rank 151 -successfully loaded 8 ZeRO state_dicts for rank 59 -successfully loaded 8 ZeRO state_dicts for rank 180 -successfully loaded 8 ZeRO state_dicts for rank 220 -successfully loaded 8 ZeRO state_dicts for rank 100 -successfully loaded 8 ZeRO state_dicts for rank 107 -successfully loaded 8 ZeRO state_dicts for rank 90 -successfully loaded 8 ZeRO state_dicts for rank 130 -successfully loaded 8 ZeRO state_dicts for rank 163 -successfully loaded 8 ZeRO state_dicts for rank 164 -successfully loaded 8 ZeRO state_dicts for rank 205 -successfully loaded 8 ZeRO state_dicts for rank 94 -successfully loaded 8 ZeRO state_dicts for rank 144 -successfully loaded 8 ZeRO state_dicts for rank 225 -successfully loaded 8 ZeRO state_dicts for rank 25 -successfully loaded 8 ZeRO 
state_dicts for rank 217 -successfully loaded 8 ZeRO state_dicts for rank 184 -successfully loaded 8 ZeRO state_dicts for rank 172 -successfully loaded 8 ZeRO state_dicts for rank 128 -successfully loaded 8 ZeRO state_dicts for rank 15 -successfully loaded 8 ZeRO state_dicts for rank 131 -successfully loaded 8 ZeRO state_dicts for rank 46 -successfully loaded 8 ZeRO state_dicts for rank 170 -successfully loaded 8 ZeRO state_dicts for rank 198 -successfully loaded 8 ZeRO state_dicts for rank 58 -successfully loaded 8 ZeRO state_dicts for rank 248 -successfully loaded 8 ZeRO state_dicts for rank 13 -loading 8 zero partition checkpoints for rank 127 -successfully loaded 8 ZeRO state_dicts for rank 183 -successfully loaded 8 ZeRO state_dicts for rank 64 -successfully loaded 8 ZeRO state_dicts for rank 105 -successfully loaded 8 ZeRO state_dicts for rank 55 -successfully loaded 8 ZeRO state_dicts for rank 66 -successfully loaded 8 ZeRO state_dicts for rank 14 -successfully loaded 8 ZeRO state_dicts for rank 240 -successfully loaded 8 ZeRO state_dicts for rank 81 -successfully loaded 8 ZeRO state_dicts for rank 186 -successfully loaded 8 ZeRO state_dicts for rank 65 -successfully loaded 8 ZeRO state_dicts for rank 146 -successfully loaded 8 ZeRO state_dicts for rank 93 -successfully loaded 8 ZeRO state_dicts for rank 200 -successfully loaded 8 ZeRO state_dicts for rank 138 -successfully loaded 8 ZeRO state_dicts for rank 211 -successfully loaded 8 ZeRO state_dicts for rank 45 -successfully loaded 8 ZeRO state_dicts for rank 38 -successfully loaded 8 ZeRO state_dicts for rank 229 -successfully loaded 8 ZeRO state_dicts for rank 129 -successfully loaded 8 ZeRO state_dicts for rank 31 -successfully loaded 8 ZeRO state_dicts for rank 197 -successfully loaded 8 ZeRO state_dicts for rank 177 -successfully loaded 8 ZeRO state_dicts for rank 116 -successfully loaded 8 ZeRO state_dicts for rank 89 -successfully loaded 8 ZeRO state_dicts for rank 117 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-24 04:02:20 CEST)" was missed by 0:00:03.124446 -successfully loaded 8 ZeRO state_dicts for rank 23 -successfully loaded 8 ZeRO state_dicts for rank 188 -successfully loaded 8 ZeRO state_dicts for rank 137 -successfully loaded 8 ZeRO state_dicts for rank 4 -successfully loaded 8 ZeRO state_dicts for rank 167 -successfully loaded 8 ZeRO state_dicts for rank 236 -loading 8 zero partition checkpoints for rank 61 -successfully loaded 8 ZeRO state_dicts for rank 207 -successfully loaded 8 ZeRO state_dicts for rank 203 -successfully loaded 8 ZeRO state_dicts for rank 176 -successfully loaded 8 ZeRO state_dicts for rank 174 -successfully loaded 8 ZeRO state_dicts for rank 202 -successfully loaded 8 ZeRO state_dicts for rank 82 -successfully loaded 8 ZeRO state_dicts for rank 169 -loading 8 zero partition checkpoints for rank 48 -successfully loaded 8 ZeRO state_dicts for rank 209 -successfully loaded 8 ZeRO state_dicts for rank 106 -successfully loaded 8 ZeRO state_dicts for rank 195 -successfully loaded 8 ZeRO state_dicts for rank 136 -successfully loaded 8 ZeRO state_dicts for rank 8 -successfully loaded 8 ZeRO state_dicts for rank 178 -successfully loaded 8 ZeRO state_dicts for rank 219 -successfully loaded 8 ZeRO state_dicts for rank 204 -successfully loaded 8 ZeRO state_dicts for rank 53 -successfully loaded 8 ZeRO state_dicts for rank 235 -successfully loaded 8 ZeRO state_dicts for rank 191 -loading 8 zero partition 
checkpoints for rank 60 -successfully loaded 8 ZeRO state_dicts for rank 227 -successfully loaded 8 ZeRO state_dicts for rank 120 -successfully loaded 8 ZeRO state_dicts for rank 175 -successfully loaded 8 ZeRO state_dicts for rank 250 -successfully loaded 8 ZeRO state_dicts for rank 189 -successfully loaded 8 ZeRO state_dicts for rank 6 -successfully loaded 8 ZeRO state_dicts for rank 237 -successfully loaded 8 ZeRO state_dicts for rank 118 -successfully loaded 8 ZeRO state_dicts for rank 119 -loading 8 zero partition checkpoints for rank 68 -successfully loaded 8 ZeRO state_dicts for rank 22 -successfully loaded 8 ZeRO state_dicts for rank 91 -successfully loaded 8 ZeRO state_dicts for rank 86 -successfully loaded 8 ZeRO state_dicts for rank 83 -successfully loaded 8 ZeRO state_dicts for rank 87 -successfully loaded 8 ZeRO state_dicts for rank 121 -successfully loaded 8 ZeRO state_dicts for rank 218 -successfully loaded 8 ZeRO state_dicts for rank 221 -loading 8 zero partition checkpoints for rank 113 -successfully loaded 8 ZeRO state_dicts for rank 9 -successfully loaded 8 ZeRO state_dicts for rank 222 -successfully loaded 8 ZeRO state_dicts for rank 251 -loading 8 zero partition checkpoints for rank 72 -successfully loaded 8 ZeRO state_dicts for rank 179 -successfully loaded 8 ZeRO state_dicts for rank 247 -successfully loaded 8 ZeRO state_dicts for rank 12 -successfully loaded 8 ZeRO state_dicts for rank 29 -successfully loaded 8 ZeRO state_dicts for rank 95 -successfully loaded 8 ZeRO state_dicts for rank 231 -successfully loaded 8 ZeRO state_dicts for rank 239 -successfully loaded 8 ZeRO state_dicts for rank 245 -loading 8 zero partition checkpoints for rank 32 -successfully loaded 8 ZeRO state_dicts for rank 255 -successfully loaded 8 ZeRO state_dicts for rank 232 -successfully loaded 8 ZeRO state_dicts for rank 238 -successfully loaded 8 ZeRO state_dicts for rank 7 -successfully loaded 8 ZeRO state_dicts for rank 228 -successfully loaded 8 ZeRO state_dicts for rank 67 -successfully loaded 8 ZeRO state_dicts for rank 252 -successfully loaded 8 ZeRO state_dicts for rank 187 -successfully loaded 8 ZeRO state_dicts for rank 230 -successfully loaded 8 ZeRO state_dicts for rank 244 -successfully loaded 8 ZeRO state_dicts for rank 194 -loading 8 zero partition checkpoints for rank 112 -loading 8 zero partition checkpoints for rank 135 -successfully loaded 8 ZeRO state_dicts for rank 5 -successfully loaded 8 ZeRO state_dicts for rank 103 -loading 8 zero partition checkpoints for rank 111 -successfully loaded 8 ZeRO state_dicts for rank 21 -loading 8 zero partition checkpoints for rank 63 -successfully loaded 8 ZeRO state_dicts for rank 165 -successfully loaded 8 ZeRO state_dicts for rank 54 -successfully loaded 8 ZeRO state_dicts for rank 102 -successfully loaded 8 ZeRO state_dicts for rank 233 -successfully loaded 8 ZeRO state_dicts for rank 85 -successfully loaded 8 ZeRO state_dicts for rank 223 -successfully loaded 8 ZeRO state_dicts for rank 11 -successfully loaded 8 ZeRO state_dicts for rank 226 -successfully loaded 8 ZeRO state_dicts for rank 101 -loading 8 zero partition checkpoints for rank 160 -loading 8 zero partition checkpoints for rank 143 -loading 8 zero partition checkpoints for rank 155 -successfully loaded 8 ZeRO state_dicts for rank 199 -successfully loaded 8 ZeRO state_dicts for rank 1 -successfully loaded 8 ZeRO state_dicts for rank 173 -successfully loaded 8 ZeRO state_dicts for rank 20 -loading 8 zero partition checkpoints for rank 162 -loading 8 zero partition 
checkpoints for rank 76 -successfully loaded 8 ZeRO state_dicts for rank 246 -successfully loaded 8 ZeRO state_dicts for rank 242 -successfully loaded 8 ZeRO state_dicts for rank 254 -successfully loaded 8 ZeRO state_dicts for rank 0 -successfully loaded 8 ZeRO state_dicts for rank 253 -successfully loaded 8 ZeRO state_dicts for rank 2 -loading 8 zero partition checkpoints for rank 27 -loading 8 zero partition checkpoints for rank 201 -loading 8 zero partition checkpoints for rank 33 -successfully loaded 8 ZeRO state_dicts for rank 224 -loading 8 zero partition checkpoints for rank 185 -loading 8 zero partition checkpoints for rank 212 -successfully loaded 8 ZeRO state_dicts for rank 122 -loading 8 zero partition checkpoints for rank 214 -loading 8 zero partition checkpoints for rank 181 -loading 8 zero partition checkpoints for rank 114 -loading 8 zero partition checkpoints for rank 39 -loading 8 zero partition checkpoints for rank 154 -successfully loaded 8 ZeRO state_dicts for rank 10 -loading 8 zero partition checkpoints for rank 132 -successfully loaded 8 ZeRO state_dicts for rank 249 -loading 8 zero partition checkpoints for rank 147 -successfully loaded 8 ZeRO state_dicts for rank 123 -successfully loaded 8 ZeRO state_dicts for rank 57 -loading 8 zero partition checkpoints for rank 213 -loading 8 zero partition checkpoints for rank 133 -loading 8 zero partition checkpoints for rank 35 -loading 8 zero partition checkpoints for rank 41 -loading 8 zero partition checkpoints for rank 156 -successfully loaded 8 ZeRO state_dicts for rank 3 -loading 8 zero partition checkpoints for rank 75 -loading 8 zero partition checkpoints for rank 148 -loading 8 zero partition checkpoints for rank 104 -loading 8 zero partition checkpoints for rank 142 -successfully loaded 8 ZeRO state_dicts for rank 234 -loading 8 zero partition checkpoints for rank 210 -loading 8 zero partition checkpoints for rank 52 -loading 8 zero partition checkpoints for rank 134 -loading 8 zero partition checkpoints for rank 70 -loading 8 zero partition checkpoints for rank 139 -successfully loaded 8 ZeRO state_dicts for rank 30 -loading 8 zero partition checkpoints for rank 161 -loading 8 zero partition checkpoints for rank 190 -loading 8 zero partition checkpoints for rank 51 -loading 8 zero partition checkpoints for rank 168 -loading 8 zero partition checkpoints for rank 158 -loading 8 zero partition checkpoints for rank 208 -loading 8 zero partition checkpoints for rank 97 -loading 8 zero partition checkpoints for rank 73 -loading 8 zero partition checkpoints for rank 152 -loading 8 zero partition checkpoints for rank 34 -loading 8 zero partition checkpoints for rank 79 -loading 8 zero partition checkpoints for rank 108 -loading 8 zero partition checkpoints for rank 241 -loading 8 zero partition checkpoints for rank 26 -loading 8 zero partition checkpoints for rank 88 -loading 8 zero partition checkpoints for rank 109 -loading 8 zero partition checkpoints for rank 157 -loading 8 zero partition checkpoints for rank 40 -loading 8 zero partition checkpoints for rank 28 -loading 8 zero partition checkpoints for rank 36 -loading 8 zero partition checkpoints for rank 215 -loading 8 zero partition checkpoints for rank 43 -loading 8 zero partition checkpoints for rank 80 -loading 8 zero partition checkpoints for rank 47 -loading 8 zero partition checkpoints for rank 192 -loading 8 zero partition checkpoints for rank 78 -loading 8 zero partition checkpoints for rank 150 -loading 8 zero partition checkpoints for rank 153 -loading 8 
zero partition checkpoints for rank 171 -loading 8 zero partition checkpoints for rank 182 -loading 8 zero partition checkpoints for rank 151 -loading 8 zero partition checkpoints for rank 140 -loading 8 zero partition checkpoints for rank 159 -loading 8 zero partition checkpoints for rank 149 -loading 8 zero partition checkpoints for rank 74 -loading 8 zero partition checkpoints for rank 77 -loading 8 zero partition checkpoints for rank 71 -loading 8 zero partition checkpoints for rank 141 -loading 8 zero partition checkpoints for rank 98 -loading 8 zero partition checkpoints for rank 128 -loading 8 zero partition checkpoints for rank 206 -loading 8 zero partition checkpoints for rank 164 -loading 8 zero partition checkpoints for rank 144 -loading 8 zero partition checkpoints for rank 62 -loading 8 zero partition checkpoints for rank 198 -loading 8 zero partition checkpoints for rank 170 -loading 8 zero partition checkpoints for rank 180 -loading 8 zero partition checkpoints for rank 130 -loading 8 zero partition checkpoints for rank 216 -loading 8 zero partition checkpoints for rank 100 -loading 8 zero partition checkpoints for rank 183 -loading 8 zero partition checkpoints for rank 38 -loading 8 zero partition checkpoints for rank 205 -loading 8 zero partition checkpoints for rank 163 -loading 8 zero partition checkpoints for rank 138 -loading 8 zero partition checkpoints for rank 184 -loading 8 zero partition checkpoints for rank 64 -loading 8 zero partition checkpoints for rank 145 -loading 8 zero partition checkpoints for rank 211 -loading 8 zero partition checkpoints for rank 186 -loading 8 zero partition checkpoints for rank 217 -loading 8 zero partition checkpoints for rank 81 -loading 8 zero partition checkpoints for rank 146 -loading 8 zero partition checkpoints for rank 96 -loading 8 zero partition checkpoints for rank 137 -loading 8 zero partition checkpoints for rank 42 -loading 8 zero partition checkpoints for rank 37 -loading 8 zero partition checkpoints for rank 44 -loading 8 zero partition checkpoints for rank 203 -loading 8 zero partition checkpoints for rank 89 -loading 8 zero partition checkpoints for rank 69 -loading 8 zero partition checkpoints for rank 167 -loading 8 zero partition checkpoints for rank 225 -loading 8 zero partition checkpoints for rank 219 -loading 8 zero partition checkpoints for rank 117 -loading 8 zero partition checkpoints for rank 136 -loading 8 zero partition checkpoints for rank 209 -loading 8 zero partition checkpoints for rank 65 -loading 8 zero partition checkpoints for rank 45 -loading 8 zero partition checkpoints for rank 202 -loading 8 zero partition checkpoints for rank 166 -loading 8 zero partition checkpoints for rank 106 -loading 8 zero partition checkpoints for rank 13 -loading 8 zero partition checkpoints for rank 196 -loading 8 zero partition checkpoints for rank 178 -loading 8 zero partition checkpoints for rank 107 -loading 8 zero partition checkpoints for rank 200 -loading 8 zero partition checkpoints for rank 189 -loading 8 zero partition checkpoints for rank 92 -loading 8 zero partition checkpoints for rank 110 -loading 8 zero partition checkpoints for rank 82 -loading 8 zero partition checkpoints for rank 86 -loading 8 zero partition checkpoints for rank 4 -loading 8 zero partition checkpoints for rank 240 -loading 8 zero partition checkpoints for rank 83 -loading 8 zero partition checkpoints for rank 56 -loading 8 zero partition checkpoints for rank 118 -loading 8 zero partition checkpoints for rank 176 -loading 8 zero 
partition checkpoints for rank 105 -loading 8 zero partition checkpoints for rank 177 -loading 8 zero partition checkpoints for rank 221 -loading 8 zero partition checkpoints for rank 222 -loading 8 zero partition checkpoints for rank 218 -loading 8 zero partition checkpoints for rank 49 -loading 8 zero partition checkpoints for rank 169 -loading 8 zero partition checkpoints for rank 194 -loading 8 zero partition checkpoints for rank 54 -loading 8 zero partition checkpoints for rank 250 -loading 8 zero partition checkpoints for rank 103 -loading 8 zero partition checkpoints for rank 199 -loading 8 zero partition checkpoints for rank 187 -loading 8 zero partition checkpoints for rank 12 -loading 8 zero partition checkpoints for rank 179 -loading 8 zero partition checkpoints for rank 29 -loading 8 zero partition checkpoints for rank 55 -loading 8 zero partition checkpoints for rank 197 -loading 8 zero partition checkpoints for rank 24 -loading 8 zero partition checkpoints for rank 85 -loading 8 zero partition checkpoints for rank 58 -loading 8 zero partition checkpoints for rank 22 -loading 8 zero partition checkpoints for rank 131 -loading 8 zero partition checkpoints for rank 229 -loading 8 zero partition checkpoints for rank 99 -loading 8 zero partition checkpoints for rank 90 -loading 8 zero partition checkpoints for rank 232 -loading 8 zero partition checkpoints for rank 193 -loading 8 zero partition checkpoints for rank 239 -loading 8 zero partition checkpoints for rank 23 -loading 8 zero partition checkpoints for rank 94 -loading 8 zero partition checkpoints for rank 236 -loading 8 zero partition checkpoints for rank 129 -loading 8 zero partition checkpoints for rank 251 -loading 8 zero partition checkpoints for rank 46 -loading 8 zero partition checkpoints for rank 21 -loading 8 zero partition checkpoints for rank 252 -loading 8 zero partition checkpoints for rank 238 -loading 8 zero partition checkpoints for rank 7 -loading 8 zero partition checkpoints for rank 53 -loading 8 zero partition checkpoints for rank 84 -loading 8 zero partition checkpoints for rank 254 -loading 8 zero partition checkpoints for rank 6 -loading 8 zero partition checkpoints for rank 245 -loading 8 zero partition checkpoints for rank 246 -loading 8 zero partition checkpoints for rank 243 -loading 8 zero partition checkpoints for rank 233 -loading 8 zero partition checkpoints for rank 1 -loading 8 zero partition checkpoints for rank 50 -loading 8 zero partition checkpoints for rank 220 -loading 8 zero partition checkpoints for rank 195 -loading 8 zero partition checkpoints for rank 237 -loading 8 zero partition checkpoints for rank 165 -loading 8 zero partition checkpoints for rank 230 -loading 8 zero partition checkpoints for rank 224 -loading 8 zero partition checkpoints for rank 207 -loading 8 zero partition checkpoints for rank 2 -loading 8 zero partition checkpoints for rank 66 -loading 8 zero partition checkpoints for rank 204 -loading 8 zero partition checkpoints for rank 59 -loading 8 zero partition checkpoints for rank 25 -loading 8 zero partition checkpoints for rank 5 -loading 8 zero partition checkpoints for rank 228 -loading 8 zero partition checkpoints for rank 91 -loading 8 zero partition checkpoints for rank 231 -loading 8 zero partition checkpoints for rank 116 -loading 8 zero partition checkpoints for rank 102 -loading 8 zero partition checkpoints for rank 20 -loading 8 zero partition checkpoints for rank 119 -loading 8 zero partition checkpoints for rank 101 -loading 8 zero partition 
checkpoints for rank 67 -loading 8 zero partition checkpoints for rank 93 -loading 8 zero partition checkpoints for rank 242 -loading 8 zero partition checkpoints for rank 188 -loading 8 zero partition checkpoints for rank 87 -loading 8 zero partition checkpoints for rank 247 -loading 8 zero partition checkpoints for rank 0 -loading 8 zero partition checkpoints for rank 244 - checkpoint version 3.0 -loading 8 zero partition checkpoints for rank 223 -loading 8 zero partition checkpoints for rank 191 -loading 8 zero partition checkpoints for rank 31 -loading 8 zero partition checkpoints for rank 57 -loading 8 zero partition checkpoints for rank 95 -loading 8 zero partition checkpoints for rank 15 -loading 8 zero partition checkpoints for rank 248 -loading 8 zero partition checkpoints for rank 120 -loading 8 zero partition checkpoints for rank 14 -loading 8 zero partition checkpoints for rank 235 -loading 8 zero partition checkpoints for rank 3 -loading 8 zero partition checkpoints for rank 121 -loading 8 zero partition checkpoints for rank 255 -loading 8 zero partition checkpoints for rank 172 -loading 8 zero partition checkpoints for rank 253 -loading 8 zero partition checkpoints for rank 227 -loading 8 zero partition checkpoints for rank 249 -loading 8 zero partition checkpoints for rank 30 -loading 8 zero partition checkpoints for rank 174 -loading 8 zero partition checkpoints for rank 226 -loading 8 zero partition checkpoints for rank 234 -loading 8 zero partition checkpoints for rank 175 -loading 8 zero partition checkpoints for rank 173 -loading 8 zero partition checkpoints for rank 122 -loading 8 zero partition checkpoints for rank 123 -loading 8 zero partition checkpoints for rank 8 -loading 8 zero partition checkpoints for rank 9 -loading 8 zero partition checkpoints for rank 11 -loading 8 zero partition checkpoints for rank 10 -successfully loaded 8 ZeRO state_dicts for rank 18 -successfully loaded 8 ZeRO state_dicts for rank 16 -successfully loaded 8 ZeRO state_dicts for rank 17 -loading 8 zero partition checkpoints for rank 18 -successfully loaded 8 ZeRO state_dicts for rank 19 -loading 8 zero partition checkpoints for rank 16 -loading 8 zero partition checkpoints for rank 17 -loading 8 zero partition checkpoints for rank 19 - successfully loaded checkpoint from /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints at iteration 474 -time (ms) | load-checkpoint: 86577.34 -[after model, optimizer, and learning rate scheduler are built] datetime: 2021-09-24 04:03:15 -> building train, validation, and test datasets ... - > datasets target sizes (minimum size): - train: 300000000 - validation: 1638400 - test: 10240 -> building train, validation, and test datasets for GPT ... - > building dataset index ... - reading sizes... - reading pointers... - reading document index... - creating numpy buffer of mmap... - creating memory view of numpy buffer... 
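A note on the checkpoint load that completes above (`load-checkpoint: 86577.34` ms): every rank reports loading 8 ZeRO state_dicts, and that 8 is the data-parallel degree. The engine.py init lines show 8 pipeline stages (STAGE=0..7) with 4 ranks per stage printing identical STAGE_PARAMS, which together with the rank numbers up to 255 implies 256 GPUs split as PP=8 x TP=4 x DP=8; under ZeRO stage 1 the fp32 optimizer state is partitioned across that DP group of 8. A minimal sketch of the arithmetic, assuming the PP=8/TP=4 layout inferred from the RANK/STAGE lines (variable names are illustrative, not from the codebase):

# Reconstructing the parallelism arithmetic implied by the log above.
# Assumption: 256 GPUs total (ranks 0..255 appear in the ZeRO load messages),
# 8 pipeline stages and 4-way tensor parallelism (4 ranks per stage report
# the same partition in the engine.py:134 lines).
WORLD_SIZE = 256
PP = 8                     # pipeline stages STAGE=0..7
TP = 4                     # ranks per stage with identical STAGE_PARAMS
DP = WORLD_SIZE // (PP * TP)
assert DP == 8             # matches "world_size ... 8" in the DeepSpeed config
                           # and "successfully loaded 8 ZeRO state_dicts"

# Global batch size as DeepSpeed derives it from the config printed above:
micro_batch_size = 1       # train_micro_batch_size_per_gpu
grad_accum = 256           # gradient_accumulation_steps (= micro_batches)
assert micro_batch_size * grad_accum * DP == 2048   # train_batch_size

This also explains why each rank loads exactly 8 shards: with ZeRO stage 1 and elastic_checkpoint true, the optimizer state saved by one 8-way DP group is reassembled from all 8 partition files on load.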
- > finished creating indexed dataset in 0.164226 seconds - number of documents: 304230423 - > dataset split: - train: - document indices in [0, 288714672) total of 288714672 documents - validation: - document indices in [288714672, 303926193) total of 15211521 documents - test: - document indices in [303926193, 304230423) total of 304230 documents - > loading doc-idx mapping from /gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document_train_indexmap_300000000ns_2048sl_42s_doc_idx.npy - > loading sample-idx mapping from /gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document_train_indexmap_300000000ns_2048sl_42s_sample_idx.npy - > loading shuffle-idx mapping from /gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document_train_indexmap_300000000ns_2048sl_42s_shuffle_idx.npy - loaded indexed file in 0.365 seconds - total number of samples: 394611670 - total number of epochs: 3 - > loading doc-idx mapping from /gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document_valid_indexmap_1638400ns_2048sl_42s_doc_idx.npy - > loading sample-idx mapping from /gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document_valid_indexmap_1638400ns_2048sl_42s_sample_idx.npy - > loading shuffle-idx mapping from /gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document_valid_indexmap_1638400ns_2048sl_42s_shuffle_idx.npy - loaded indexed file in 0.203 seconds - total number of samples: 6927161 - total number of epochs: 1 - > loading doc-idx mapping from /gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document_test_indexmap_10240ns_2048sl_42s_doc_idx.npy - > loading sample-idx mapping from /gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document_test_indexmap_10240ns_2048sl_42s_sample_idx.npy - > loading shuffle-idx mapping from /gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document_test_indexmap_10240ns_2048sl_42s_shuffle_idx.npy - loaded indexed file in 0.072 seconds - total number of samples: 137384 - total number of epochs: 1 -> finished creating GPT datasets ... -[after dataloaders are built] datetime: 2021-09-24 04:03:22 -done with setup ... -training ... 
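Before the iteration stream below: the fp16 config printed earlier uses dynamic loss scaling ("loss_scale": 0, "loss_scale_window": 500, "hysteresis": 2, "min_loss_scale": 1, "initial_scale_power": 12), so training starts at a scale of 2^12 = 4096.0, and the log shows the scale doubling to 8192.0 exactly at iteration 500 after an overflow-free window. A minimal sketch of that update rule, assuming the standard dynamic-scaler behaviour (an illustration, not DeepSpeed's actual implementation):

# Illustrative dynamic fp16 loss scaler mirroring the config above.
class DynamicLossScaler:
    def __init__(self, initial_scale_power=12, window=500, min_scale=1.0):
        self.scale = 2.0 ** initial_scale_power   # 4096.0, as logged at iter 475
        self.window = window                      # loss_scale_window
        self.min_scale = min_scale                # min_loss_scale
        self.good_steps = 0

    def update(self, found_overflow: bool):
        if found_overflow:
            # Overflow: halve the scale, skip the step (the log's "number of
            # skipped iterations" counter), and restart the window. The real
            # scaler additionally waits for hysteresis=2 overflows first.
            self.scale = max(self.scale / 2.0, self.min_scale)
            self.good_steps = 0
        else:
            self.good_steps += 1
            if self.good_steps == self.window:
                self.scale *= 2.0                 # 4096 -> 8192 at iteration 500
                self.good_steps = 0

With no overflows the scale doubles every 500 steps, which is exactly the pattern in the iterations that follow: loss scale 4096.0 through iteration 499, 8192.0 from iteration 500 on, with "number of skipped iterations: 0" throughout.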
-time (ms) | model-and-optimizer-setup: 94922.27 | train/valid/test-data-iterators-setup: 5644.20 -[before the start of training step] datetime: 2021-09-24 04:03:22 -[2021-09-24 04:03:22,280] [INFO] [checkpointing.py:408:forward] Activation Checkpointing Information -[2021-09-24 04:03:22,280] [INFO] [checkpointing.py:409:forward] ----Partition Activations False, CPU CHECKPOINTING False -[2021-09-24 04:03:22,281] [INFO] [checkpointing.py:412:forward] ----contiguous Memory Checkpointing False with 32 total layers -[2021-09-24 04:03:22,281] [INFO] [checkpointing.py:415:forward] ----Synchronization False -[2021-09-24 04:03:22,281] [INFO] [checkpointing.py:416:forward] ----Profiling time in checkpointing False -[2021-09-24 04:03:47] PULSE: tr8-104B is waiting to be scheduled (1159457_[1-10%1] on 'gpu_p13' partition) -[2021-09-24 04:03:47] PULSE: tr8-104B is scheduled to start in 18:10:24 (at 2021-09-24T22:14:12) (1161605 on 'gpu_p13' partition) -[2021-09-24 04:03:47] PULSE: tr8-104B is running for 2:42 since 2021-09-24T04:01:05 (1162747 on 'gpu_p13' partition (r6i4n[5,7],r6i5n[2,7-8],r6i6n[0,2,6],r7i2n[4-5],r7i6n[2-4],r7i7n[7-8],r8i0n[2-3,5-8],r8i1n[0,2-4],r8i2n8,r8i3n[0-2],r8i5n[3-4],r8i7n[3-8],r9i0n[0-2],r9i1n[0-3],r9i2n[3-5,8],r9i3n[0-1,7-8],r9i4n[0-2],r9i5n[3-8],r9i6n[0,7-8]) -[Rank 33] (after 475 iterations) memory (MB) | allocated: 5861.5498046875 | max allocated: 10450.46337890625 | reserved: 18442.0 | max reserved: 18442.0 -[Rank 65] (after 475 iterations) memory (MB) | allocated: 5861.5498046875 | max allocated: 10450.46337890625 | reserved: 18826.0 | max reserved: 18826.0 -[Rank 1] (after 475 iterations) memory (MB) | allocated: 6661.611328125 | max allocated: 11742.55810546875 | reserved: 21150.0 | max reserved: 21150.0 -[Rank 225] (after 475 iterations) memory (MB) | allocated: 7107.70751953125 | max allocated: 11884.6845703125 | reserved: 22108.0 | max reserved: 22108.0 -[Rank 97] (after 475 iterations) memory (MB) | allocated: 5861.5498046875 | max allocated: 10450.46337890625 | reserved: 18442.0 | max reserved: 18442.0 -[Rank 129] (after 475 iterations) memory (MB) | allocated: 5861.5498046875 | max allocated: 10450.46337890625 | reserved: 18586.0 | max reserved: 18586.0 -[Rank 193] (after 475 iterations) memory (MB) | allocated: 5861.5498046875 | max allocated: 10450.46337890625 | reserved: 18778.0 | max reserved: 18778.0 -[Rank 161] (after 475 iterations) memory (MB) | allocated: 5861.5498046875 | max allocated: 10450.46337890625 | reserved: 18442.0 | max reserved: 18442.0 -[Rank 2] (after 475 iterations) memory (MB) | allocated: 6661.611328125 | max allocated: 11742.55810546875 | reserved: 22878.0 | max reserved: 22878.0 -[Rank 226] (after 475 iterations) memory (MB) | allocated: 7107.70751953125 | max allocated: 11884.6845703125 | reserved: 20752.0 | max reserved: 20752.0 -[Rank 34] (after 475 iterations) memory (MB) | allocated: 5861.5498046875 | max allocated: 10450.46337890625 | reserved: 18586.0 | max reserved: 18586.0 -[Rank 66] (after 475 iterations) memory (MB) | allocated: 5861.5498046875 | max allocated: 10450.46337890625 | reserved: 18586.0 | max reserved: 18586.0 -[Rank 98] (after 475 iterations) memory (MB) | allocated: 5861.5498046875 | max allocated: 10450.46337890625 | reserved: 18522.0 | max reserved: 18522.0 -[Rank 130] (after 475 iterations) memory (MB) | allocated: 5861.5498046875 | max allocated: 10450.46337890625 | reserved: 18586.0 | max reserved: 18586.0 -[Rank 194] (after 475 iterations) memory (MB) | allocated: 5861.5498046875 | max allocated: 
10450.46337890625 | reserved: 18442.0 | max reserved: 18442.0 -[Rank 162] (after 475 iterations) memory (MB) | allocated: 5861.5498046875 | max allocated: 10450.46337890625 | reserved: 18442.0 | max reserved: 18442.0 -[Rank 0] (after 475 iterations) memory (MB) | allocated: 6661.611328125 | max allocated: 11742.55810546875 | reserved: 23514.0 | max reserved: 23514.0 -[Rank 224] (after 475 iterations) memory (MB) | allocated: 7107.70751953125 | max allocated: 11884.6845703125 | reserved: 22108.0 | max reserved: 22108.0 -[Rank 32] (after 475 iterations) memory (MB) | allocated: 5861.5498046875 | max allocated: 10450.46337890625 | reserved: 19012.0 | max reserved: 19012.0 -[Rank 64] (after 475 iterations) memory (MB) | allocated: 5861.5498046875 | max allocated: 10450.46337890625 | reserved: 19012.0 | max reserved: 19012.0 -[Rank 96] (after 475 iterations) memory (MB) | allocated: 5861.5498046875 | max allocated: 10450.46337890625 | reserved: 19012.0 | max reserved: 19012.0 -[Rank 192] (after 475 iterations) memory (MB) | allocated: 5861.5498046875 | max allocated: 10450.46337890625 | reserved: 18884.0 | max reserved: 18884.0 -[Rank 128] (after 475 iterations) memory (MB) | allocated: 5861.5498046875 | max allocated: 10450.46337890625 | reserved: 18884.0 | max reserved: 18884.0 -[Rank 160] (after 475 iterations) memory (MB) | allocated: 5861.5498046875 | max allocated: 10450.46337890625 | reserved: 18868.0 | max reserved: 18868.0 -[Rank 3] (after 475 iterations) memory (MB) | allocated: 6661.611328125 | max allocated: 11742.55810546875 | reserved: 22890.0 | max reserved: 22890.0 -[Rank 35] (after 475 iterations) memory (MB) | allocated: 5861.5498046875 | max allocated: 10450.46337890625 | reserved: 18442.0 | max reserved: 18442.0 -[Rank 227] (after 475 iterations) memory (MB) | allocated: 7107.70751953125 | max allocated: 11884.6845703125 | reserved: 20752.0 | max reserved: 20752.0 -[Rank 67] (after 475 iterations) memory (MB) | allocated: 5861.5498046875 | max allocated: 10450.46337890625 | reserved: 18586.0 | max reserved: 18586.0 -[Rank 99] (after 475 iterations) memory (MB) | allocated: 5861.5498046875 | max allocated: 10450.46337890625 | reserved: 18442.0 | max reserved: 18442.0 -[Rank 131] (after 475 iterations) memory (MB) | allocated: 5861.5498046875 | max allocated: 10450.46337890625 | reserved: 18522.0 | max reserved: 18522.0 -[Rank 195] (after 475 iterations) memory (MB) | allocated: 5861.5498046875 | max allocated: 10450.46337890625 | reserved: 18442.0 | max reserved: 18442.0 -[Rank 163] (after 475 iterations) memory (MB) | allocated: 5861.5498046875 | max allocated: 10450.46337890625 | reserved: 18442.0 | max reserved: 18442.0 - iteration 475/ 159576 | consumed samples: 7600 | elapsed time per iteration (ms): 29962.7 | learning rate: 2.108E-06 | global batch size: 16 | lm loss: 7.833103E+00 | loss scale: 4096.0 | grad norm: 47969.708 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 476/ 159576 | consumed samples: 7616 | elapsed time per iteration (ms): 13562.3 | learning rate: 2.112E-06 | global batch size: 16 | lm loss: 7.715385E+00 | loss scale: 4096.0 | grad norm: 28643.174 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 477/ 159576 | consumed samples: 7632 | elapsed time per iteration (ms): 14532.6 | learning rate: 2.117E-06 | global batch size: 16 | lm loss: 7.912835E+00 | loss scale: 4096.0 | grad norm: 18978.073 | num zeros: 0.0 | number of skipped iterations: 
0 | number of nan iterations: 0 | -time (ms) - iteration 478/ 159576 | consumed samples: 7648 | elapsed time per iteration (ms): 13659.0 | learning rate: 2.121E-06 | global batch size: 16 | lm loss: 7.845491E+00 | loss scale: 4096.0 | grad norm: 29417.161 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 479/ 159576 | consumed samples: 7664 | elapsed time per iteration (ms): 13928.5 | learning rate: 2.126E-06 | global batch size: 16 | lm loss: 7.818515E+00 | loss scale: 4096.0 | grad norm: 24185.570 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 480/ 159576 | consumed samples: 7680 | elapsed time per iteration (ms): 13863.2 | learning rate: 2.130E-06 | global batch size: 16 | lm loss: 7.759526E+00 | loss scale: 4096.0 | grad norm: 18058.893 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 481/ 159576 | consumed samples: 7696 | elapsed time per iteration (ms): 13613.0 | learning rate: 2.135E-06 | global batch size: 16 | lm loss: 7.666837E+00 | loss scale: 4096.0 | grad norm: 21581.295 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 482/ 159576 | consumed samples: 7712 | elapsed time per iteration (ms): 13350.8 | learning rate: 2.139E-06 | global batch size: 16 | lm loss: 7.929407E+00 | loss scale: 4096.0 | grad norm: 22311.348 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 483/ 159576 | consumed samples: 7728 | elapsed time per iteration (ms): 13819.2 | learning rate: 2.143E-06 | global batch size: 16 | lm loss: 7.786575E+00 | loss scale: 4096.0 | grad norm: 23821.522 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 484/ 159576 | consumed samples: 7744 | elapsed time per iteration (ms): 13697.3 | learning rate: 2.148E-06 | global batch size: 16 | lm loss: 7.834505E+00 | loss scale: 4096.0 | grad norm: 18706.902 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 485/ 159576 | consumed samples: 7760 | elapsed time per iteration (ms): 13285.4 | learning rate: 2.152E-06 | global batch size: 16 | lm loss: 7.796403E+00 | loss scale: 4096.0 | grad norm: 23055.088 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 486/ 159576 | consumed samples: 7776 | elapsed time per iteration (ms): 13893.0 | learning rate: 2.157E-06 | global batch size: 16 | lm loss: 7.853868E+00 | loss scale: 4096.0 | grad norm: 16300.893 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 487/ 159576 | consumed samples: 7792 | elapsed time per iteration (ms): 14059.7 | learning rate: 2.161E-06 | global batch size: 16 | lm loss: 7.943846E+00 | loss scale: 4096.0 | grad norm: 18420.386 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 488/ 159576 | consumed samples: 7808 | elapsed time per iteration (ms): 13994.0 | learning rate: 2.166E-06 | global batch size: 16 | lm loss: 7.850654E+00 | loss scale: 4096.0 | grad norm: 17235.839 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 489/ 159576 | consumed samples: 7824 | elapsed time per iteration (ms): 13596.2 | learning rate: 2.170E-06 | global batch size: 16 | lm loss: 
7.825228E+00 | loss scale: 4096.0 | grad norm: 16217.059 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 490/ 159576 | consumed samples: 7840 | elapsed time per iteration (ms): 14562.4 | learning rate: 2.175E-06 | global batch size: 16 | lm loss: 7.944909E+00 | loss scale: 4096.0 | grad norm: 20367.528 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 491/ 159576 | consumed samples: 7856 | elapsed time per iteration (ms): 13373.8 | learning rate: 2.179E-06 | global batch size: 16 | lm loss: 7.772738E+00 | loss scale: 4096.0 | grad norm: 14868.924 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 492/ 159576 | consumed samples: 7872 | elapsed time per iteration (ms): 13407.0 | learning rate: 2.183E-06 | global batch size: 16 | lm loss: 7.807293E+00 | loss scale: 4096.0 | grad norm: 12933.190 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 493/ 159576 | consumed samples: 7888 | elapsed time per iteration (ms): 13535.9 | learning rate: 2.188E-06 | global batch size: 16 | lm loss: 7.796512E+00 | loss scale: 4096.0 | grad norm: 14067.056 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 494/ 159576 | consumed samples: 7904 | elapsed time per iteration (ms): 13629.5 | learning rate: 2.192E-06 | global batch size: 16 | lm loss: 7.792056E+00 | loss scale: 4096.0 | grad norm: 14953.693 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 495/ 159576 | consumed samples: 7920 | elapsed time per iteration (ms): 14163.4 | learning rate: 2.197E-06 | global batch size: 16 | lm loss: 7.703032E+00 | loss scale: 4096.0 | grad norm: 14533.162 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 496/ 159576 | consumed samples: 7936 | elapsed time per iteration (ms): 13588.6 | learning rate: 2.201E-06 | global batch size: 16 | lm loss: 7.740438E+00 | loss scale: 4096.0 | grad norm: 13505.957 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 497/ 159576 | consumed samples: 7952 | elapsed time per iteration (ms): 13861.0 | learning rate: 2.206E-06 | global batch size: 16 | lm loss: 7.741710E+00 | loss scale: 4096.0 | grad norm: 15979.829 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 498/ 159576 | consumed samples: 7968 | elapsed time per iteration (ms): 13984.2 | learning rate: 2.210E-06 | global batch size: 16 | lm loss: 7.999316E+00 | loss scale: 4096.0 | grad norm: 17409.113 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 499/ 159576 | consumed samples: 7984 | elapsed time per iteration (ms): 13944.3 | learning rate: 2.214E-06 | global batch size: 16 | lm loss: 7.852047E+00 | loss scale: 4096.0 | grad norm: 17274.017 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 500/ 159576 | consumed samples: 8000 | elapsed time per iteration (ms): 13842.0 | learning rate: 2.219E-06 | global batch size: 16 | lm loss: 7.828729E+00 | loss scale: 8192.0 | grad norm: 13323.901 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 501/ 159576 | consumed samples: 8016 
| elapsed time per iteration (ms): 13887.5 | learning rate: 2.223E-06 | global batch size: 16 | lm loss: 7.889397E+00 | loss scale: 8192.0 | grad norm: 36733.789 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 502/ 159576 | consumed samples: 8032 | elapsed time per iteration (ms): 14250.0 | learning rate: 2.228E-06 | global batch size: 16 | lm loss: 7.699535E+00 | loss scale: 8192.0 | grad norm: 25128.484 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 503/ 159576 | consumed samples: 8048 | elapsed time per iteration (ms): 14013.2 | learning rate: 2.232E-06 | global batch size: 16 | lm loss: 7.717435E+00 | loss scale: 8192.0 | grad norm: 27928.260 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 504/ 159576 | consumed samples: 8064 | elapsed time per iteration (ms): 13885.3 | learning rate: 2.237E-06 | global batch size: 16 | lm loss: 7.793045E+00 | loss scale: 8192.0 | grad norm: 25342.573 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 505/ 159576 | consumed samples: 8080 | elapsed time per iteration (ms): 14216.7 | learning rate: 2.241E-06 | global batch size: 16 | lm loss: 7.810180E+00 | loss scale: 8192.0 | grad norm: 32722.154 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 506/ 159576 | consumed samples: 8096 | elapsed time per iteration (ms): 13476.3 | learning rate: 2.246E-06 | global batch size: 16 | lm loss: 7.789536E+00 | loss scale: 8192.0 | grad norm: 28438.282 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 507/ 159576 | consumed samples: 8112 | elapsed time per iteration (ms): 13866.3 | learning rate: 2.250E-06 | global batch size: 16 | lm loss: 7.752525E+00 | loss scale: 8192.0 | grad norm: 38662.247 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 508/ 159576 | consumed samples: 8128 | elapsed time per iteration (ms): 14262.5 | learning rate: 2.254E-06 | global batch size: 16 | lm loss: 7.916237E+00 | loss scale: 8192.0 | grad norm: 36720.277 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 509/ 159576 | consumed samples: 8144 | elapsed time per iteration (ms): 13929.6 | learning rate: 2.259E-06 | global batch size: 16 | lm loss: 7.943053E+00 | loss scale: 8192.0 | grad norm: 38847.168 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 510/ 159576 | consumed samples: 8160 | elapsed time per iteration (ms): 13830.3 | learning rate: 2.263E-06 | global batch size: 16 | lm loss: 7.853089E+00 | loss scale: 8192.0 | grad norm: 37581.397 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 511/ 159576 | consumed samples: 8176 | elapsed time per iteration (ms): 13826.8 | learning rate: 2.268E-06 | global batch size: 16 | lm loss: 7.664119E+00 | loss scale: 8192.0 | grad norm: 34046.642 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 512/ 159576 | consumed samples: 8192 | elapsed time per iteration (ms): 14623.1 | learning rate: 2.272E-06 | global batch size: 16 | lm loss: 7.786874E+00 | loss scale: 8192.0 | grad norm: 28303.899 | num zeros: 0.0 | number of skipped 
iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 513/ 159576 | consumed samples: 8208 | elapsed time per iteration (ms): 13633.3 | learning rate: 2.277E-06 | global batch size: 16 | lm loss: 7.763934E+00 | loss scale: 8192.0 | grad norm: 32905.082 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 514/ 159576 | consumed samples: 8224 | elapsed time per iteration (ms): 13562.5 | learning rate: 2.281E-06 | global batch size: 16 | lm loss: 7.825607E+00 | loss scale: 8192.0 | grad norm: 32400.005 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 515/ 159576 | consumed samples: 8240 | elapsed time per iteration (ms): 13893.1 | learning rate: 2.286E-06 | global batch size: 16 | lm loss: 7.780645E+00 | loss scale: 8192.0 | grad norm: 39597.501 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 516/ 159576 | consumed samples: 8256 | elapsed time per iteration (ms): 13943.0 | learning rate: 2.290E-06 | global batch size: 16 | lm loss: 7.949652E+00 | loss scale: 8192.0 | grad norm: 29624.844 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 517/ 159576 | consumed samples: 8272 | elapsed time per iteration (ms): 13457.2 | learning rate: 2.294E-06 | global batch size: 16 | lm loss: 7.840482E+00 | loss scale: 8192.0 | grad norm: 34709.122 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -[2021-09-24 04:13:42] PULSE: tr8-104B is waiting for the previous job to finish before scheduling a new one using the dependency mechanism (1162855_[1-10%1] on 'gpu_p13' partition) -[2021-09-24 04:13:42] PULSE: tr8-104B is running for 12:37 since 2021-09-24T04:01:05 (1162747 on 'gpu_p13' partition (r6i4n[5,7],r6i5n[2,7-8],r6i6n[0,2,6],r7i2n[4-5],r7i6n[2-4],r7i7n[7-8],r8i0n[2-3,5-8],r8i1n[0,2-4],r8i2n8,r8i3n[0-2],r8i5n[3-4],r8i7n[3-8],r9i0n[0-2],r9i1n[0-3],r9i2n[3-5,8],r9i3n[0-1,7-8],r9i4n[0-2],r9i5n[3-8],r9i6n[0,7-8]) - iteration 518/ 159576 | consumed samples: 8288 | elapsed time per iteration (ms): 13506.3 | learning rate: 2.299E-06 | global batch size: 16 | lm loss: 7.914812E+00 | loss scale: 8192.0 | grad norm: 24295.892 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 519/ 159576 | consumed samples: 8304 | elapsed time per iteration (ms): 14169.8 | learning rate: 2.303E-06 | global batch size: 16 | lm loss: 7.710842E+00 | loss scale: 8192.0 | grad norm: 32528.032 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 520/ 159576 | consumed samples: 8320 | elapsed time per iteration (ms): 13829.9 | learning rate: 2.308E-06 | global batch size: 16 | lm loss: 7.806552E+00 | loss scale: 8192.0 | grad norm: 37677.096 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 521/ 159576 | consumed samples: 8336 | elapsed time per iteration (ms): 13564.6 | learning rate: 2.312E-06 | global batch size: 16 | lm loss: 7.817222E+00 | loss scale: 8192.0 | grad norm: 30827.133 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 522/ 159576 | consumed samples: 8352 | elapsed time per iteration (ms): 13848.1 | learning rate: 2.317E-06 | global batch size: 16 | lm loss: 7.805755E+00 | loss scale: 8192.0 | grad norm: 31599.999 | num zeros: 0.0 
| number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 523/ 159576 | consumed samples: 8368 | elapsed time per iteration (ms): 13893.6 | learning rate: 2.321E-06 | global batch size: 16 | lm loss: 7.845006E+00 | loss scale: 8192.0 | grad norm: 34359.630 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 524/ 159576 | consumed samples: 8384 | elapsed time per iteration (ms): 13874.2 | learning rate: 2.325E-06 | global batch size: 16 | lm loss: 7.806132E+00 | loss scale: 8192.0 | grad norm: 34509.027 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 525/ 159576 | consumed samples: 8400 | elapsed time per iteration (ms): 14357.0 | learning rate: 2.330E-06 | global batch size: 16 | lm loss: 7.713592E+00 | loss scale: 8192.0 | grad norm: 36961.324 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 526/ 159576 | consumed samples: 8416 | elapsed time per iteration (ms): 14049.5 | learning rate: 2.334E-06 | global batch size: 16 | lm loss: 7.744096E+00 | loss scale: 8192.0 | grad norm: 46754.633 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 527/ 159576 | consumed samples: 8432 | elapsed time per iteration (ms): 14142.6 | learning rate: 2.339E-06 | global batch size: 16 | lm loss: 7.798402E+00 | loss scale: 8192.0 | grad norm: 38396.563 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 528/ 159576 | consumed samples: 8448 | elapsed time per iteration (ms): 13474.9 | learning rate: 2.343E-06 | global batch size: 16 | lm loss: 7.987565E+00 | loss scale: 8192.0 | grad norm: 36935.417 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 529/ 159576 | consumed samples: 8464 | elapsed time per iteration (ms): 14180.8 | learning rate: 2.348E-06 | global batch size: 16 | lm loss: 7.766053E+00 | loss scale: 8192.0 | grad norm: 35413.152 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 530/ 159576 | consumed samples: 8480 | elapsed time per iteration (ms): 13844.6 | learning rate: 2.352E-06 | global batch size: 16 | lm loss: 7.906172E+00 | loss scale: 8192.0 | grad norm: 26808.092 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 531/ 159576 | consumed samples: 8496 | elapsed time per iteration (ms): 13786.0 | learning rate: 2.357E-06 | global batch size: 16 | lm loss: 7.840616E+00 | loss scale: 8192.0 | grad norm: 38477.035 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 532/ 159576 | consumed samples: 8512 | elapsed time per iteration (ms): 13935.0 | learning rate: 2.361E-06 | global batch size: 16 | lm loss: 7.367872E+00 | loss scale: 8192.0 | grad norm: 51156.079 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 533/ 159576 | consumed samples: 8528 | elapsed time per iteration (ms): 14022.6 | learning rate: 2.365E-06 | global batch size: 16 | lm loss: 7.941976E+00 | loss scale: 8192.0 | grad norm: 46439.024 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 534/ 159576 | consumed samples: 8544 | elapsed time per iteration (ms): 14296.7 | learning rate: 2.370E-06 | 
global batch size: 16 | lm loss: 7.869607E+00 | loss scale: 8192.0 | grad norm: 29876.193 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 535/ 159576 | consumed samples: 8560 | elapsed time per iteration (ms): 13470.0 | learning rate: 2.374E-06 | global batch size: 16 | lm loss: 7.635067E+00 | loss scale: 8192.0 | grad norm: 34076.920 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 536/ 159576 | consumed samples: 8576 | elapsed time per iteration (ms): 13796.1 | learning rate: 2.379E-06 | global batch size: 16 | lm loss: 7.842813E+00 | loss scale: 8192.0 | grad norm: 41800.450 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 537/ 159576 | consumed samples: 8592 | elapsed time per iteration (ms): 13818.0 | learning rate: 2.383E-06 | global batch size: 16 | lm loss: 7.984433E+00 | loss scale: 8192.0 | grad norm: 38203.372 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 538/ 159576 | consumed samples: 8608 | elapsed time per iteration (ms): 14109.2 | learning rate: 2.388E-06 | global batch size: 16 | lm loss: 7.724606E+00 | loss scale: 8192.0 | grad norm: 44792.862 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 539/ 159576 | consumed samples: 8624 | elapsed time per iteration (ms): 13906.3 | learning rate: 2.392E-06 | global batch size: 16 | lm loss: 7.800515E+00 | loss scale: 8192.0 | grad norm: 32297.704 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 540/ 159576 | consumed samples: 8640 | elapsed time per iteration (ms): 14143.5 | learning rate: 2.396E-06 | global batch size: 16 | lm loss: 7.871832E+00 | loss scale: 8192.0 | grad norm: 43120.437 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 541/ 159576 | consumed samples: 8656 | elapsed time per iteration (ms): 14084.0 | learning rate: 2.401E-06 | global batch size: 16 | lm loss: 7.872537E+00 | loss scale: 8192.0 | grad norm: 36867.265 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 542/ 159576 | consumed samples: 8672 | elapsed time per iteration (ms): 13874.8 | learning rate: 2.405E-06 | global batch size: 16 | lm loss: 7.777860E+00 | loss scale: 8192.0 | grad norm: 43001.704 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 543/ 159576 | consumed samples: 8688 | elapsed time per iteration (ms): 13779.4 | learning rate: 2.410E-06 | global batch size: 16 | lm loss: 7.682357E+00 | loss scale: 8192.0 | grad norm: 57139.433 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 544/ 159576 | consumed samples: 8704 | elapsed time per iteration (ms): 14017.8 | learning rate: 2.414E-06 | global batch size: 16 | lm loss: 7.819186E+00 | loss scale: 8192.0 | grad norm: 29983.983 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 545/ 159576 | consumed samples: 8720 | elapsed time per iteration (ms): 13847.0 | learning rate: 2.419E-06 | global batch size: 16 | lm loss: 7.843667E+00 | loss scale: 8192.0 | grad norm: 66015.612 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 
546/ 159576 | consumed samples: 8736 | elapsed time per iteration (ms): 13982.1 | learning rate: 2.423E-06 | global batch size: 16 | lm loss: 7.894298E+00 | loss scale: 8192.0 | grad norm: 51768.956 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 547/ 159576 | consumed samples: 8752 | elapsed time per iteration (ms): 14302.0 | learning rate: 2.428E-06 | global batch size: 16 | lm loss: 7.715273E+00 | loss scale: 8192.0 | grad norm: 39105.868 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 548/ 159576 | consumed samples: 8768 | elapsed time per iteration (ms): 14035.0 | learning rate: 2.432E-06 | global batch size: 16 | lm loss: 7.707379E+00 | loss scale: 8192.0 | grad norm: 39549.896 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 549/ 159576 | consumed samples: 8784 | elapsed time per iteration (ms): 13590.6 | learning rate: 2.436E-06 | global batch size: 16 | lm loss: 7.786090E+00 | loss scale: 8192.0 | grad norm: 29894.490 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 550/ 159576 | consumed samples: 8800 | elapsed time per iteration (ms): 13742.1 | learning rate: 2.441E-06 | global batch size: 16 | lm loss: 7.726188E+00 | loss scale: 8192.0 | grad norm: 34821.397 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 551/ 159576 | consumed samples: 8816 | elapsed time per iteration (ms): 13975.5 | learning rate: 2.445E-06 | global batch size: 16 | lm loss: 7.823754E+00 | loss scale: 8192.0 | grad norm: 41726.396 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 552/ 159576 | consumed samples: 8832 | elapsed time per iteration (ms): 13862.7 | learning rate: 2.450E-06 | global batch size: 16 | lm loss: 7.780801E+00 | loss scale: 8192.0 | grad norm: 39107.293 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 553/ 159576 | consumed samples: 8848 | elapsed time per iteration (ms): 13828.8 | learning rate: 2.454E-06 | global batch size: 16 | lm loss: 7.722218E+00 | loss scale: 8192.0 | grad norm: 34436.410 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 554/ 159576 | consumed samples: 8864 | elapsed time per iteration (ms): 14180.4 | learning rate: 2.459E-06 | global batch size: 16 | lm loss: 7.731545E+00 | loss scale: 8192.0 | grad norm: 26819.965 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 555/ 159576 | consumed samples: 8880 | elapsed time per iteration (ms): 14282.2 | learning rate: 2.463E-06 | global batch size: 16 | lm loss: 7.705241E+00 | loss scale: 8192.0 | grad norm: 49659.971 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 556/ 159576 | consumed samples: 8896 | elapsed time per iteration (ms): 13646.8 | learning rate: 2.467E-06 | global batch size: 16 | lm loss: 8.003874E+00 | loss scale: 8192.0 | grad norm: 37645.277 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 557/ 159576 | consumed samples: 8912 | elapsed time per iteration (ms): 13958.8 | learning rate: 2.472E-06 | global batch size: 16 | lm loss: 7.782984E+00 | loss scale: 8192.0 | grad norm: 61655.017 
| num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 558/ 159576 | consumed samples: 8928 | elapsed time per iteration (ms): 13955.4 | learning rate: 2.476E-06 | global batch size: 16 | lm loss: 7.811559E+00 | loss scale: 8192.0 | grad norm: 48428.452 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 559/ 159576 | consumed samples: 8944 | elapsed time per iteration (ms): 14457.4 | learning rate: 2.481E-06 | global batch size: 16 | lm loss: 7.931767E+00 | loss scale: 8192.0 | grad norm: 38443.785 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 560/ 159576 | consumed samples: 8960 | elapsed time per iteration (ms): 13823.4 | learning rate: 2.485E-06 | global batch size: 16 | lm loss: 7.793911E+00 | loss scale: 8192.0 | grad norm: 40207.993 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 561/ 159576 | consumed samples: 8976 | elapsed time per iteration (ms): 13982.4 | learning rate: 2.490E-06 | global batch size: 16 | lm loss: 7.842747E+00 | loss scale: 8192.0 | grad norm: 36711.017 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 562/ 159576 | consumed samples: 8992 | elapsed time per iteration (ms): 14372.1 | learning rate: 2.494E-06 | global batch size: 16 | lm loss: 7.878882E+00 | loss scale: 8192.0 | grad norm: 54306.049 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 563/ 159576 | consumed samples: 9008 | elapsed time per iteration (ms): 13678.7 | learning rate: 2.499E-06 | global batch size: 16 | lm loss: 7.849220E+00 | loss scale: 8192.0 | grad norm: 37543.010 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 564/ 159576 | consumed samples: 9024 | elapsed time per iteration (ms): 14069.8 | learning rate: 2.503E-06 | global batch size: 16 | lm loss: 7.844311E+00 | loss scale: 8192.0 | grad norm: 44716.799 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 565/ 159576 | consumed samples: 9040 | elapsed time per iteration (ms): 13957.6 | learning rate: 2.507E-06 | global batch size: 16 | lm loss: 7.913968E+00 | loss scale: 8192.0 | grad norm: 47566.400 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 566/ 159576 | consumed samples: 9056 | elapsed time per iteration (ms): 14044.6 | learning rate: 2.512E-06 | global batch size: 16 | lm loss: 7.683057E+00 | loss scale: 8192.0 | grad norm: 46568.215 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 567/ 159576 | consumed samples: 9072 | elapsed time per iteration (ms): 13881.5 | learning rate: 2.516E-06 | global batch size: 16 | lm loss: 7.870160E+00 | loss scale: 8192.0 | grad norm: 41402.594 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 568/ 159576 | consumed samples: 9088 | elapsed time per iteration (ms): 14311.0 | learning rate: 2.521E-06 | global batch size: 16 | lm loss: 7.629350E+00 | loss scale: 8192.0 | grad norm: 39843.869 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 569/ 159576 | consumed samples: 9104 | elapsed time per iteration (ms): 14124.8 | learning 
rate: 2.525E-06 | global batch size: 16 | lm loss: 7.845489E+00 | loss scale: 8192.0 | grad norm: 47458.318 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 570/ 159576 | consumed samples: 9120 | elapsed time per iteration (ms): 13702.3 | learning rate: 2.530E-06 | global batch size: 16 | lm loss: 7.848298E+00 | loss scale: 8192.0 | grad norm: 53032.711 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 571/ 159576 | consumed samples: 9136 | elapsed time per iteration (ms): 13866.4 | learning rate: 2.534E-06 | global batch size: 16 | lm loss: 7.659620E+00 | loss scale: 8192.0 | grad norm: 37376.686 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 572/ 159576 | consumed samples: 9152 | elapsed time per iteration (ms): 14443.8 | learning rate: 2.538E-06 | global batch size: 16 | lm loss: 7.711428E+00 | loss scale: 8192.0 | grad norm: 36846.713 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 573/ 159576 | consumed samples: 9168 | elapsed time per iteration (ms): 13723.1 | learning rate: 2.543E-06 | global batch size: 16 | lm loss: 7.800463E+00 | loss scale: 8192.0 | grad norm: 40022.109 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 574/ 159576 | consumed samples: 9184 | elapsed time per iteration (ms): 13313.2 | learning rate: 2.547E-06 | global batch size: 16 | lm loss: 7.722570E+00 | loss scale: 8192.0 | grad norm: 57675.937 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 575/ 159576 | consumed samples: 9200 | elapsed time per iteration (ms): 13533.3 | learning rate: 2.552E-06 | global batch size: 16 | lm loss: 7.797169E+00 | loss scale: 8192.0 | grad norm: 44067.573 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 576/ 159576 | consumed samples: 9216 | elapsed time per iteration (ms): 13750.6 | learning rate: 2.556E-06 | global batch size: 16 | lm loss: 7.624088E+00 | loss scale: 8192.0 | grad norm: 37579.519 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 577/ 159576 | consumed samples: 9232 | elapsed time per iteration (ms): 14117.7 | learning rate: 2.561E-06 | global batch size: 16 | lm loss: 7.644238E+00 | loss scale: 8192.0 | grad norm: 57135.338 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 578/ 159576 | consumed samples: 9248 | elapsed time per iteration (ms): 13229.4 | learning rate: 2.565E-06 | global batch size: 16 | lm loss: 7.769429E+00 | loss scale: 8192.0 | grad norm: 45266.144 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 579/ 159576 | consumed samples: 9264 | elapsed time per iteration (ms): 13610.6 | learning rate: 2.570E-06 | global batch size: 16 | lm loss: 7.508770E+00 | loss scale: 8192.0 | grad norm: 35604.839 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 580/ 159576 | consumed samples: 9280 | elapsed time per iteration (ms): 13468.6 | learning rate: 2.574E-06 | global batch size: 16 | lm loss: 7.727168E+00 | loss scale: 8192.0 | grad norm: 37920.954 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time 
(ms) - iteration 581/ 159576 | consumed samples: 9296 | elapsed time per iteration (ms): 14350.0 | learning rate: 2.578E-06 | global batch size: 16 | lm loss: 7.883451E+00 | loss scale: 8192.0 | grad norm: 46515.319 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 582/ 159576 | consumed samples: 9312 | elapsed time per iteration (ms): 13963.5 | learning rate: 2.583E-06 | global batch size: 16 | lm loss: 7.781512E+00 | loss scale: 8192.0 | grad norm: 50170.474 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 583/ 159576 | consumed samples: 9328 | elapsed time per iteration (ms): 13557.9 | learning rate: 2.587E-06 | global batch size: 16 | lm loss: 7.964473E+00 | loss scale: 8192.0 | grad norm: 29593.283 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 584/ 159576 | consumed samples: 9344 | elapsed time per iteration (ms): 13684.8 | learning rate: 2.592E-06 | global batch size: 16 | lm loss: 7.855813E+00 | loss scale: 8192.0 | grad norm: 39619.717 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 585/ 159576 | consumed samples: 9360 | elapsed time per iteration (ms): 13900.2 | learning rate: 2.596E-06 | global batch size: 16 | lm loss: 7.877661E+00 | loss scale: 8192.0 | grad norm: 31203.205 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 586/ 159576 | consumed samples: 9376 | elapsed time per iteration (ms): 13512.1 | learning rate: 2.601E-06 | global batch size: 16 | lm loss: 7.887114E+00 | loss scale: 8192.0 | grad norm: 63261.561 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 587/ 159576 | consumed samples: 9392 | elapsed time per iteration (ms): 13501.8 | learning rate: 2.605E-06 | global batch size: 16 | lm loss: 7.815706E+00 | loss scale: 8192.0 | grad norm: 47655.867 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 588/ 159576 | consumed samples: 9408 | elapsed time per iteration (ms): 13350.5 | learning rate: 2.609E-06 | global batch size: 16 | lm loss: 7.754656E+00 | loss scale: 8192.0 | grad norm: 49073.965 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 589/ 159576 | consumed samples: 9424 | elapsed time per iteration (ms): 13532.4 | learning rate: 2.614E-06 | global batch size: 16 | lm loss: 7.622519E+00 | loss scale: 8192.0 | grad norm: 39015.125 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 590/ 159576 | consumed samples: 9440 | elapsed time per iteration (ms): 13725.1 | learning rate: 2.618E-06 | global batch size: 16 | lm loss: 7.841989E+00 | loss scale: 8192.0 | grad norm: 59373.276 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 591/ 159576 | consumed samples: 9456 | elapsed time per iteration (ms): 13818.0 | learning rate: 2.623E-06 | global batch size: 16 | lm loss: 7.730304E+00 | loss scale: 8192.0 | grad norm: 56512.310 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 592/ 159576 | consumed samples: 9472 | elapsed time per iteration (ms): 13289.0 | learning rate: 2.627E-06 | global batch size: 16 | lm loss: 7.849043E+00 | loss scale: 8192.0 | grad 
norm: 44031.624 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 593/ 159576 | consumed samples: 9488 | elapsed time per iteration (ms): 13614.5 | learning rate: 2.632E-06 | global batch size: 16 | lm loss: 7.807899E+00 | loss scale: 8192.0 | grad norm: 43332.506 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 594/ 159576 | consumed samples: 9504 | elapsed time per iteration (ms): 14163.8 | learning rate: 2.636E-06 | global batch size: 16 | lm loss: 7.765454E+00 | loss scale: 8192.0 | grad norm: 57221.926 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 595/ 159576 | consumed samples: 9520 | elapsed time per iteration (ms): 13156.1 | learning rate: 2.641E-06 | global batch size: 16 | lm loss: 7.647946E+00 | loss scale: 8192.0 | grad norm: 61799.391 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 596/ 159576 | consumed samples: 9536 | elapsed time per iteration (ms): 13612.4 | learning rate: 2.645E-06 | global batch size: 16 | lm loss: 7.788985E+00 | loss scale: 8192.0 | grad norm: 47569.358 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 597/ 159576 | consumed samples: 9552 | elapsed time per iteration (ms): 13614.3 | learning rate: 2.649E-06 | global batch size: 16 | lm loss: 7.796825E+00 | loss scale: 8192.0 | grad norm: 34793.812 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 598/ 159576 | consumed samples: 9568 | elapsed time per iteration (ms): 13701.2 | learning rate: 2.654E-06 | global batch size: 16 | lm loss: 7.797745E+00 | loss scale: 8192.0 | grad norm: 78279.259 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 599/ 159576 | consumed samples: 9584 | elapsed time per iteration (ms): 13638.2 | learning rate: 2.658E-06 | global batch size: 16 | lm loss: 7.724266E+00 | loss scale: 8192.0 | grad norm: 52804.639 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 600/ 159576 | consumed samples: 9600 | elapsed time per iteration (ms): 13579.9 | learning rate: 2.663E-06 | global batch size: 16 | lm loss: 7.820310E+00 | loss scale: 8192.0 | grad norm: 37266.274 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 601/ 159576 | consumed samples: 9616 | elapsed time per iteration (ms): 13865.9 | learning rate: 2.667E-06 | global batch size: 16 | lm loss: 7.770097E+00 | loss scale: 8192.0 | grad norm: 35207.333 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 602/ 159576 | consumed samples: 9632 | elapsed time per iteration (ms): 13180.7 | learning rate: 2.672E-06 | global batch size: 16 | lm loss: 7.816167E+00 | loss scale: 8192.0 | grad norm: 38744.019 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 603/ 159576 | consumed samples: 9648 | elapsed time per iteration (ms): 13931.1 | learning rate: 2.676E-06 | global batch size: 16 | lm loss: 7.817324E+00 | loss scale: 8192.0 | grad norm: 36573.432 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 604/ 159576 | consumed samples: 9664 | elapsed time per iteration (ms): 
13626.6 | learning rate: 2.680E-06 | global batch size: 16 | lm loss: 7.730925E+00 | loss scale: 8192.0 | grad norm: 34465.028 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 605/ 159576 | consumed samples: 9680 | elapsed time per iteration (ms): 13615.1 | learning rate: 2.685E-06 | global batch size: 16 | lm loss: 7.862791E+00 | loss scale: 8192.0 | grad norm: 36177.270 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 606/ 159576 | consumed samples: 9696 | elapsed time per iteration (ms): 13496.6 | learning rate: 2.689E-06 | global batch size: 16 | lm loss: 7.773019E+00 | loss scale: 8192.0 | grad norm: 41679.512 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 607/ 159576 | consumed samples: 9712 | elapsed time per iteration (ms): 14055.9 | learning rate: 2.694E-06 | global batch size: 16 | lm loss: 7.785677E+00 | loss scale: 8192.0 | grad norm: 37271.202 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 608/ 159576 | consumed samples: 9728 | elapsed time per iteration (ms): 13879.6 | learning rate: 2.698E-06 | global batch size: 16 | lm loss: 7.825086E+00 | loss scale: 8192.0 | grad norm: 47809.442 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 609/ 159576 | consumed samples: 9744 | elapsed time per iteration (ms): 13552.3 | learning rate: 2.703E-06 | global batch size: 16 | lm loss: 7.740236E+00 | loss scale: 8192.0 | grad norm: 52434.959 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 610/ 159576 | consumed samples: 9760 | elapsed time per iteration (ms): 13176.0 | learning rate: 2.707E-06 | global batch size: 16 | lm loss: 7.737531E+00 | loss scale: 8192.0 | grad norm: 48525.539 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 611/ 159576 | consumed samples: 9776 | elapsed time per iteration (ms): 13593.3 | learning rate: 2.712E-06 | global batch size: 16 | lm loss: 7.592016E+00 | loss scale: 8192.0 | grad norm: 43005.689 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 612/ 159576 | consumed samples: 9792 | elapsed time per iteration (ms): 13859.6 | learning rate: 2.716E-06 | global batch size: 16 | lm loss: 7.717112E+00 | loss scale: 8192.0 | grad norm: 39297.786 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 613/ 159576 | consumed samples: 9808 | elapsed time per iteration (ms): 13457.1 | learning rate: 2.720E-06 | global batch size: 16 | lm loss: 7.876259E+00 | loss scale: 8192.0 | grad norm: 46784.787 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 614/ 159576 | consumed samples: 9824 | elapsed time per iteration (ms): 13891.1 | learning rate: 2.725E-06 | global batch size: 16 | lm loss: 7.783233E+00 | loss scale: 8192.0 | grad norm: 55950.281 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 615/ 159576 | consumed samples: 9840 | elapsed time per iteration (ms): 13986.9 | learning rate: 2.729E-06 | global batch size: 16 | lm loss: 7.671467E+00 | loss scale: 8192.0 | grad norm: 37634.889 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan 
iterations: 0 | -time (ms) - iteration 616/ 159576 | consumed samples: 9856 | elapsed time per iteration (ms): 14382.5 | learning rate: 2.734E-06 | global batch size: 16 | lm loss: 7.716076E+00 | loss scale: 8192.0 | grad norm: 39465.766 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 617/ 159576 | consumed samples: 9872 | elapsed time per iteration (ms): 13446.9 | learning rate: 2.738E-06 | global batch size: 16 | lm loss: 7.701165E+00 | loss scale: 8192.0 | grad norm: 33600.381 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 618/ 159576 | consumed samples: 9888 | elapsed time per iteration (ms): 13921.0 | learning rate: 2.743E-06 | global batch size: 16 | lm loss: 7.846385E+00 | loss scale: 8192.0 | grad norm: 34178.825 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 619/ 159576 | consumed samples: 9904 | elapsed time per iteration (ms): 13866.6 | learning rate: 2.747E-06 | global batch size: 16 | lm loss: 7.788978E+00 | loss scale: 8192.0 | grad norm: 39840.427 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 620/ 159576 | consumed samples: 9920 | elapsed time per iteration (ms): 14194.3 | learning rate: 2.751E-06 | global batch size: 16 | lm loss: 7.718859E+00 | loss scale: 8192.0 | grad norm: 35668.255 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 621/ 159576 | consumed samples: 9936 | elapsed time per iteration (ms): 14052.1 | learning rate: 2.756E-06 | global batch size: 16 | lm loss: 7.815299E+00 | loss scale: 8192.0 | grad norm: 65082.529 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 622/ 159576 | consumed samples: 9952 | elapsed time per iteration (ms): 13986.4 | learning rate: 2.760E-06 | global batch size: 16 | lm loss: 7.647432E+00 | loss scale: 8192.0 | grad norm: 30577.960 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 623/ 159576 | consumed samples: 9968 | elapsed time per iteration (ms): 14070.1 | learning rate: 2.765E-06 | global batch size: 16 | lm loss: 7.470105E+00 | loss scale: 8192.0 | grad norm: 49150.823 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 624/ 159576 | consumed samples: 9984 | elapsed time per iteration (ms): 13591.8 | learning rate: 2.769E-06 | global batch size: 16 | lm loss: 7.751683E+00 | loss scale: 8192.0 | grad norm: 37773.421 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 625/ 159576 | consumed samples: 10000 | elapsed time per iteration (ms): 14109.1 | learning rate: 2.774E-06 | global batch size: 16 | lm loss: 7.850559E+00 | loss scale: 8192.0 | grad norm: 49716.008 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 626/ 159576 | consumed samples: 10016 | elapsed time per iteration (ms): 13883.7 | learning rate: 2.778E-06 | global batch size: 16 | lm loss: 7.761450E+00 | loss scale: 8192.0 | grad norm: 40472.569 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 627/ 159576 | consumed samples: 10032 | elapsed time per iteration (ms): 13871.1 | learning rate: 2.783E-06 | global batch size: 16 | lm loss: 7.638558E+00 | 
loss scale: 8192.0 | grad norm: 32194.907 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 628/ 159576 | consumed samples: 10048 | elapsed time per iteration (ms): 14009.2 | learning rate: 2.787E-06 | global batch size: 16 | lm loss: 7.602344E+00 | loss scale: 8192.0 | grad norm: 48067.346 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 629/ 159576 | consumed samples: 10064 | elapsed time per iteration (ms): 14668.1 | learning rate: 2.791E-06 | global batch size: 16 | lm loss: 7.641259E+00 | loss scale: 8192.0 | grad norm: 36222.940 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 630/ 159576 | consumed samples: 10080 | elapsed time per iteration (ms): 13862.3 | learning rate: 2.796E-06 | global batch size: 16 | lm loss: 7.665779E+00 | loss scale: 8192.0 | grad norm: 42515.535 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 631/ 159576 | consumed samples: 10096 | elapsed time per iteration (ms): 13588.5 | learning rate: 2.800E-06 | global batch size: 16 | lm loss: 7.754525E+00 | loss scale: 8192.0 | grad norm: 49054.878 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 632/ 159576 | consumed samples: 10112 | elapsed time per iteration (ms): 13844.9 | learning rate: 2.805E-06 | global batch size: 16 | lm loss: 7.774928E+00 | loss scale: 8192.0 | grad norm: 45662.541 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 633/ 159576 | consumed samples: 10128 | elapsed time per iteration (ms): 14341.8 | learning rate: 2.809E-06 | global batch size: 16 | lm loss: 7.554594E+00 | loss scale: 8192.0 | grad norm: 60744.743 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 634/ 159576 | consumed samples: 10144 | elapsed time per iteration (ms): 13746.1 | learning rate: 2.814E-06 | global batch size: 16 | lm loss: 7.637143E+00 | loss scale: 8192.0 | grad norm: 49330.376 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 635/ 159576 | consumed samples: 10160 | elapsed time per iteration (ms): 13894.5 | learning rate: 2.818E-06 | global batch size: 16 | lm loss: 7.983640E+00 | loss scale: 8192.0 | grad norm: 49417.095 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 636/ 159576 | consumed samples: 10176 | elapsed time per iteration (ms): 14194.7 | learning rate: 2.822E-06 | global batch size: 16 | lm loss: 7.681066E+00 | loss scale: 8192.0 | grad norm: 61468.093 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 637/ 159576 | consumed samples: 10192 | elapsed time per iteration (ms): 13961.2 | learning rate: 2.827E-06 | global batch size: 16 | lm loss: 7.862648E+00 | loss scale: 8192.0 | grad norm: 72192.162 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 638/ 159576 | consumed samples: 10208 | elapsed time per iteration (ms): 13647.5 | learning rate: 2.831E-06 | global batch size: 16 | lm loss: 7.569575E+00 | loss scale: 8192.0 | grad norm: 45669.961 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 639/ 159576 | consumed samples: 10224 | 
elapsed time per iteration (ms): 13856.5 | learning rate: 2.836E-06 | global batch size: 16 | lm loss: 7.844266E+00 | loss scale: 8192.0 | grad norm: 36677.085 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 640/ 159576 | consumed samples: 10240 | elapsed time per iteration (ms): 14073.9 | learning rate: 2.840E-06 | global batch size: 16 | lm loss: 7.845327E+00 | loss scale: 8192.0 | grad norm: 96907.467 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 641/ 159576 | consumed samples: 10256 | elapsed time per iteration (ms): 13796.2 | learning rate: 2.845E-06 | global batch size: 16 | lm loss: 7.647357E+00 | loss scale: 8192.0 | grad norm: 57700.704 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 642/ 159576 | consumed samples: 10272 | elapsed time per iteration (ms): 14118.9 | learning rate: 2.849E-06 | global batch size: 16 | lm loss: 7.207680E+00 | loss scale: 8192.0 | grad norm: 51064.672 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 643/ 159576 | consumed samples: 10288 | elapsed time per iteration (ms): 14102.7 | learning rate: 2.854E-06 | global batch size: 16 | lm loss: 7.651158E+00 | loss scale: 8192.0 | grad norm: 42382.351 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 644/ 159576 | consumed samples: 10304 | elapsed time per iteration (ms): 14051.2 | learning rate: 2.858E-06 | global batch size: 16 | lm loss: 7.854011E+00 | loss scale: 8192.0 | grad norm: 91247.279 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 645/ 159576 | consumed samples: 10320 | elapsed time per iteration (ms): 13538.9 | learning rate: 2.862E-06 | global batch size: 16 | lm loss: 7.769484E+00 | loss scale: 8192.0 | grad norm: 69652.208 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 646/ 159576 | consumed samples: 10336 | elapsed time per iteration (ms): 14249.0 | learning rate: 2.867E-06 | global batch size: 16 | lm loss: 7.553013E+00 | loss scale: 8192.0 | grad norm: 51636.193 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 647/ 159576 | consumed samples: 10352 | elapsed time per iteration (ms): 13970.2 | learning rate: 2.871E-06 | global batch size: 16 | lm loss: 8.084120E+00 | loss scale: 8192.0 | grad norm: 43277.569 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 648/ 159576 | consumed samples: 10368 | elapsed time per iteration (ms): 13853.5 | learning rate: 2.876E-06 | global batch size: 16 | lm loss: 7.727980E+00 | loss scale: 8192.0 | grad norm: 61582.321 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 649/ 159576 | consumed samples: 10384 | elapsed time per iteration (ms): 13732.7 | learning rate: 2.880E-06 | global batch size: 16 | lm loss: 8.087885E+00 | loss scale: 8192.0 | grad norm: 80675.460 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 650/ 159576 | consumed samples: 10400 | elapsed time per iteration (ms): 14065.0 | learning rate: 2.885E-06 | global batch size: 16 | lm loss: 7.735159E+00 | loss scale: 8192.0 | grad norm: 57826.799 | num zeros: 0.0 | number of 
skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 651/ 159576 | consumed samples: 10416 | elapsed time per iteration (ms): 14427.2 | learning rate: 2.889E-06 | global batch size: 16 | lm loss: 7.631308E+00 | loss scale: 8192.0 | grad norm: 36267.499 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 652/ 159576 | consumed samples: 10432 | elapsed time per iteration (ms): 13615.7 | learning rate: 2.893E-06 | global batch size: 16 | lm loss: 7.756464E+00 | loss scale: 8192.0 | grad norm: 90673.943 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 653/ 159576 | consumed samples: 10448 | elapsed time per iteration (ms): 13935.6 | learning rate: 2.898E-06 | global batch size: 16 | lm loss: 7.687772E+00 | loss scale: 8192.0 | grad norm: 73567.241 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 654/ 159576 | consumed samples: 10464 | elapsed time per iteration (ms): 14106.4 | learning rate: 2.902E-06 | global batch size: 16 | lm loss: 7.805472E+00 | loss scale: 8192.0 | grad norm: 43212.657 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 655/ 159576 | consumed samples: 10480 | elapsed time per iteration (ms): 13870.0 | learning rate: 2.907E-06 | global batch size: 16 | lm loss: 7.733329E+00 | loss scale: 8192.0 | grad norm: 42721.480 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 656/ 159576 | consumed samples: 10496 | elapsed time per iteration (ms): 13912.1 | learning rate: 2.911E-06 | global batch size: 16 | lm loss: 7.764544E+00 | loss scale: 8192.0 | grad norm: 95237.236 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 657/ 159576 | consumed samples: 10512 | elapsed time per iteration (ms): 13959.6 | learning rate: 2.916E-06 | global batch size: 16 | lm loss: 7.873410E+00 | loss scale: 8192.0 | grad norm: 58039.908 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 658/ 159576 | consumed samples: 10528 | elapsed time per iteration (ms): 14236.4 | learning rate: 2.920E-06 | global batch size: 16 | lm loss: 7.776018E+00 | loss scale: 8192.0 | grad norm: 47844.539 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 659/ 159576 | consumed samples: 10544 | elapsed time per iteration (ms): 14055.2 | learning rate: 2.925E-06 | global batch size: 16 | lm loss: 7.913632E+00 | loss scale: 8192.0 | grad norm: 52680.297 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 660/ 159576 | consumed samples: 10560 | elapsed time per iteration (ms): 13952.7 | learning rate: 2.929E-06 | global batch size: 16 | lm loss: 7.682195E+00 | loss scale: 8192.0 | grad norm: 43818.277 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 661/ 159576 | consumed samples: 10576 | elapsed time per iteration (ms): 14150.0 | learning rate: 2.933E-06 | global batch size: 16 | lm loss: 7.787490E+00 | loss scale: 8192.0 | grad norm: 79352.333 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 662/ 159576 | consumed samples: 10592 | elapsed time per iteration (ms): 13865.0 | learning rate: 2.938E-06 | 
global batch size: 16 | lm loss: 7.774850E+00 | loss scale: 8192.0 | grad norm: 38730.216 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 663/ 159576 | consumed samples: 10608 | elapsed time per iteration (ms): 14161.1 | learning rate: 2.942E-06 | global batch size: 16 | lm loss: 7.580084E+00 | loss scale: 8192.0 | grad norm: 41013.803 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 664/ 159576 | consumed samples: 10624 | elapsed time per iteration (ms): 13917.2 | learning rate: 2.947E-06 | global batch size: 16 | lm loss: 7.885849E+00 | loss scale: 8192.0 | grad norm: 52940.997 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 665/ 159576 | consumed samples: 10640 | elapsed time per iteration (ms): 14187.3 | learning rate: 2.951E-06 | global batch size: 16 | lm loss: 7.708643E+00 | loss scale: 8192.0 | grad norm: 45471.400 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 666/ 159576 | consumed samples: 10656 | elapsed time per iteration (ms): 13816.1 | learning rate: 2.956E-06 | global batch size: 16 | lm loss: 7.852731E+00 | loss scale: 8192.0 | grad norm: 34948.074 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 667/ 159576 | consumed samples: 10672 | elapsed time per iteration (ms): 13998.2 | learning rate: 2.960E-06 | global batch size: 16 | lm loss: 7.783283E+00 | loss scale: 8192.0 | grad norm: 72415.130 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 668/ 159576 | consumed samples: 10688 | elapsed time per iteration (ms): 14355.3 | learning rate: 2.964E-06 | global batch size: 16 | lm loss: 7.606567E+00 | loss scale: 8192.0 | grad norm: 40358.601 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 669/ 159576 | consumed samples: 10704 | elapsed time per iteration (ms): 13737.0 | learning rate: 2.969E-06 | global batch size: 16 | lm loss: 7.726189E+00 | loss scale: 8192.0 | grad norm: 40258.377 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 670/ 159576 | consumed samples: 10720 | elapsed time per iteration (ms): 13793.7 | learning rate: 2.973E-06 | global batch size: 16 | lm loss: 7.691747E+00 | loss scale: 8192.0 | grad norm: 41826.699 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 671/ 159576 | consumed samples: 10736 | elapsed time per iteration (ms): 13990.9 | learning rate: 2.978E-06 | global batch size: 16 | lm loss: 7.731771E+00 | loss scale: 8192.0 | grad norm: 73683.310 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 672/ 159576 | consumed samples: 10752 | elapsed time per iteration (ms): 14342.7 | learning rate: 2.982E-06 | global batch size: 16 | lm loss: 7.751697E+00 | loss scale: 8192.0 | grad norm: 45162.989 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 673/ 159576 | consumed samples: 10768 | elapsed time per iteration (ms): 14019.6 | learning rate: 2.987E-06 | global batch size: 16 | lm loss: 7.628830E+00 | loss scale: 8192.0 | grad norm: 50354.520 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - 
iteration 674/ 159576 | consumed samples: 10784 | elapsed time per iteration (ms): 13505.9 | learning rate: 2.991E-06 | global batch size: 16 | lm loss: 7.737679E+00 | loss scale: 8192.0 | grad norm: 42630.535 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 675/ 159576 | consumed samples: 10800 | elapsed time per iteration (ms): 14062.7 | learning rate: 2.996E-06 | global batch size: 16 | lm loss: 7.697219E+00 | loss scale: 8192.0 | grad norm: 74141.374 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 676/ 159576 | consumed samples: 10816 | elapsed time per iteration (ms): 14348.9 | learning rate: 3.000E-06 | global batch size: 16 | lm loss: 7.685856E+00 | loss scale: 8192.0 | grad norm: 42229.307 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 677/ 159576 | consumed samples: 10832 | elapsed time per iteration (ms): 13490.6 | learning rate: 3.004E-06 | global batch size: 16 | lm loss: 7.675433E+00 | loss scale: 8192.0 | grad norm: 41266.542 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 678/ 159576 | consumed samples: 10848 | elapsed time per iteration (ms): 13864.0 | learning rate: 3.009E-06 | global batch size: 16 | lm loss: 7.602362E+00 | loss scale: 8192.0 | grad norm: 28128.791 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 679/ 159576 | consumed samples: 10864 | elapsed time per iteration (ms): 13876.8 | learning rate: 3.013E-06 | global batch size: 16 | lm loss: 7.921748E+00 | loss scale: 8192.0 | grad norm: 94093.080 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 680/ 159576 | consumed samples: 10880 | elapsed time per iteration (ms): 14089.6 | learning rate: 3.018E-06 | global batch size: 16 | lm loss: 7.932827E+00 | loss scale: 8192.0 | grad norm: 66492.252 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 681/ 159576 | consumed samples: 10896 | elapsed time per iteration (ms): 13869.3 | learning rate: 3.022E-06 | global batch size: 16 | lm loss: 7.712299E+00 | loss scale: 8192.0 | grad norm: 48293.630 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 682/ 159576 | consumed samples: 10912 | elapsed time per iteration (ms): 14135.1 | learning rate: 3.027E-06 | global batch size: 16 | lm loss: 7.638190E+00 | loss scale: 8192.0 | grad norm: 38847.818 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 683/ 159576 | consumed samples: 10928 | elapsed time per iteration (ms): 13923.5 | learning rate: 3.031E-06 | global batch size: 16 | lm loss: 7.728378E+00 | loss scale: 8192.0 | grad norm: 145094.985 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 684/ 159576 | consumed samples: 10944 | elapsed time per iteration (ms): 13370.2 | learning rate: 3.036E-06 | global batch size: 16 | lm loss: 7.695971E+00 | loss scale: 8192.0 | grad norm: 72337.161 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 685/ 159576 | consumed samples: 10960 | elapsed time per iteration (ms): 14077.4 | learning rate: 3.040E-06 | global batch size: 16 | lm loss: 7.967864E+00 | loss scale: 8192.0 
| grad norm: 60013.396 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 686/ 159576 | consumed samples: 10976 | elapsed time per iteration (ms): 13866.9 | learning rate: 3.044E-06 | global batch size: 16 | lm loss: 7.790969E+00 | loss scale: 8192.0 | grad norm: 66989.408 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 687/ 159576 | consumed samples: 10992 | elapsed time per iteration (ms): 13994.5 | learning rate: 3.049E-06 | global batch size: 16 | lm loss: 7.558614E+00 | loss scale: 8192.0 | grad norm: 41316.798 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 688/ 159576 | consumed samples: 11008 | elapsed time per iteration (ms): 13732.9 | learning rate: 3.053E-06 | global batch size: 16 | lm loss: 7.831646E+00 | loss scale: 8192.0 | grad norm: 113582.407 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 689/ 159576 | consumed samples: 11024 | elapsed time per iteration (ms): 14223.7 | learning rate: 3.058E-06 | global batch size: 16 | lm loss: 7.934176E+00 | loss scale: 8192.0 | grad norm: 88203.837 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 690/ 159576 | consumed samples: 11040 | elapsed time per iteration (ms): 14149.5 | learning rate: 3.062E-06 | global batch size: 16 | lm loss: 8.017797E+00 | loss scale: 8192.0 | grad norm: 58624.816 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 691/ 159576 | consumed samples: 11056 | elapsed time per iteration (ms): 13400.2 | learning rate: 3.067E-06 | global batch size: 16 | lm loss: 7.660833E+00 | loss scale: 8192.0 | grad norm: 55959.298 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 692/ 159576 | consumed samples: 11072 | elapsed time per iteration (ms): 13833.8 | learning rate: 3.071E-06 | global batch size: 16 | lm loss: 7.664068E+00 | loss scale: 8192.0 | grad norm: 59276.124 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 693/ 159576 | consumed samples: 11088 | elapsed time per iteration (ms): 14240.4 | learning rate: 3.075E-06 | global batch size: 16 | lm loss: 7.707018E+00 | loss scale: 8192.0 | grad norm: 93883.971 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 694/ 159576 | consumed samples: 11104 | elapsed time per iteration (ms): 13875.3 | learning rate: 3.080E-06 | global batch size: 16 | lm loss: 7.786274E+00 | loss scale: 8192.0 | grad norm: 64903.918 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 695/ 159576 | consumed samples: 11120 | elapsed time per iteration (ms): 13813.0 | learning rate: 3.084E-06 | global batch size: 16 | lm loss: 7.512930E+00 | loss scale: 8192.0 | grad norm: 51983.944 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 696/ 159576 | consumed samples: 11136 | elapsed time per iteration (ms): 13976.3 | learning rate: 3.089E-06 | global batch size: 16 | lm loss: 7.692935E+00 | loss scale: 8192.0 | grad norm: 60144.327 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 697/ 159576 | consumed samples: 11152 | elapsed time per 
iteration (ms): 14241.9 | learning rate: 3.093E-06 | global batch size: 16 | lm loss: 7.665162E+00 | loss scale: 8192.0 | grad norm: 45825.959 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 698/ 159576 | consumed samples: 11168 | elapsed time per iteration (ms): 13633.7 | learning rate: 3.098E-06 | global batch size: 16 | lm loss: 7.619460E+00 | loss scale: 8192.0 | grad norm: 50817.283 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 699/ 159576 | consumed samples: 11184 | elapsed time per iteration (ms): 13862.8 | learning rate: 3.102E-06 | global batch size: 16 | lm loss: 7.827911E+00 | loss scale: 8192.0 | grad norm: 55475.644 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 700/ 159576 | consumed samples: 11200 | elapsed time per iteration (ms): 13992.4 | learning rate: 3.107E-06 | global batch size: 16 | lm loss: 7.651889E+00 | loss scale: 8192.0 | grad norm: 41255.123 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 701/ 159576 | consumed samples: 11216 | elapsed time per iteration (ms): 13980.6 | learning rate: 3.111E-06 | global batch size: 16 | lm loss: 7.715150E+00 | loss scale: 8192.0 | grad norm: 54466.199 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 702/ 159576 | consumed samples: 11232 | elapsed time per iteration (ms): 13968.4 | learning rate: 3.115E-06 | global batch size: 16 | lm loss: 7.782993E+00 | loss scale: 8192.0 | grad norm: 52144.399 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 703/ 159576 | consumed samples: 11248 | elapsed time per iteration (ms): 13960.9 | learning rate: 3.120E-06 | global batch size: 16 | lm loss: 7.681329E+00 | loss scale: 8192.0 | grad norm: 51153.990 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 704/ 159576 | consumed samples: 11264 | elapsed time per iteration (ms): 14082.5 | learning rate: 3.124E-06 | global batch size: 16 | lm loss: 7.697348E+00 | loss scale: 8192.0 | grad norm: 30117.468 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 705/ 159576 | consumed samples: 11280 | elapsed time per iteration (ms): 13980.4 | learning rate: 3.129E-06 | global batch size: 16 | lm loss: 7.733425E+00 | loss scale: 8192.0 | grad norm: 49027.047 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 706/ 159576 | consumed samples: 11296 | elapsed time per iteration (ms): 13865.4 | learning rate: 3.133E-06 | global batch size: 16 | lm loss: 7.844088E+00 | loss scale: 8192.0 | grad norm: 43555.293 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 707/ 159576 | consumed samples: 11312 | elapsed time per iteration (ms): 13817.5 | learning rate: 3.138E-06 | global batch size: 16 | lm loss: 7.752273E+00 | loss scale: 8192.0 | grad norm: 96517.184 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 708/ 159576 | consumed samples: 11328 | elapsed time per iteration (ms): 13958.9 | learning rate: 3.142E-06 | global batch size: 16 | lm loss: 7.757376E+00 | loss scale: 8192.0 | grad norm: 77216.323 | num zeros: 0.0 | number of skipped 
iterations: 0 | number of nan iterations: 0 | -time (ms) -

tr8-104B pre-training log, iterations 709-939 of 159576, one row per iteration record (the excerpt opens in the tail of the previous record, and the final record, 939, is cut off in the source). Every record also carries the constant fields: consumed samples = 16 × iteration | global batch size: 16 | loss scale: 8192.0 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | and each record is followed by a "time (ms)" line.

iter | ms/iter | learning rate | lm loss | grad norm
 709 | 13428.3 | 3.146E-06 | 7.687693E+00 | 57064.888
 710 | 13648.2 | 3.151E-06 | 7.663705E+00 | 50512.811
 711 | 14017.0 | 3.155E-06 | 7.597622E+00 | 52114.282
 712 | 13780.7 | 3.160E-06 | 7.771480E+00 | 169756.868
 713 | 13096.8 | 3.164E-06 | 7.713109E+00 | 87094.017
 714 | 13743.9 | 3.169E-06 | 7.749861E+00 | 49749.127
 715 | 14274.0 | 3.173E-06 | 7.797529E+00 | 51932.227
 716 | 13788.8 | 3.178E-06 | 7.704132E+00 | 68478.047
 717 | 13977.5 | 3.182E-06 | 7.746219E+00 | 107770.469
 718 | 13786.8 | 3.186E-06 | 7.617724E+00 | 57419.512
 719 | 14003.5 | 3.191E-06 | 7.642632E+00 | 48000.387
 720 | 13651.1 | 3.195E-06 | 7.790938E+00 | 45384.886
 721 | 13820.3 | 3.200E-06 | 7.799318E+00 | 94827.685
 722 | 13998.9 | 3.204E-06 | 7.924202E+00 | 106713.536
 723 | 13787.6 | 3.209E-06 | 7.662113E+00 | 53132.240
 724 | 14003.4 | 3.213E-06 | 7.735355E+00 | 46503.362
 725 | 14211.4 | 3.217E-06 | 7.413515E+00 | 46300.080
 726 | 14085.1 | 3.222E-06 | 7.793005E+00 | 123901.591
 727 | 13498.1 | 3.226E-06 | 7.570110E+00 | 110746.235
 728 | 13944.5 | 3.231E-06 | 7.805285E+00 | 54666.569
 729 | 13478.9 | 3.235E-06 | 7.702326E+00 | 95219.862
 730 | 13419.9 | 3.240E-06 | 7.694516E+00 | 44428.528
 731 | 13890.7 | 3.244E-06 | 7.656667E+00 | 79142.888
 732 | 14381.2 | 3.249E-06 | 7.689932E+00 | 69883.450

[2021-09-24 05:03:31] PULSE: tr8-104B is waiting for the previous job to finish before scheduling a new one using the dependency mechanism (1162855_[1-10%1] on 'gpu_p13' partition)
[2021-09-24 05:03:31] PULSE: tr8-104B is running for 1:02:26 since 2021-09-24T04:01:05 (1162747 on 'gpu_p13' partition (r6i4n[5,7],r6i5n[2,7-8],r6i6n[0,2,6],r7i2n[4-5],r7i6n[2-4],r7i7n[7-8],r8i0n[2-3,5-8],r8i1n[0,2-4],r8i2n8,r8i3n[0-2],r8i5n[3-4],r8i7n[3-8],r9i0n[0-2],r9i1n[0-3],r9i2n[3-5,8],r9i3n[0-1,7-8],r9i4n[0-2],r9i5n[3-8],r9i6n[0,7-8]))
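Each record follows a single fixed template, which makes the log straightforward to post-process. Below is a minimal sketch (not part of the original log) of a Python parser for these Megatron-style iteration records; the path main_log.txt and all identifier names are illustrative, and the regex covers only the fields visible in this excerpt.

import re

# Matches one iteration record of the form shown above, e.g.
#   iteration 733/ 159576 | consumed samples: 11728 | elapsed time per iteration (ms): 13725.2 | ...
# (illustrative sketch; field list limited to what this log prints)
RECORD_RE = re.compile(
    r"iteration (?P<iteration>\d+)/ ?(?P<total>\d+) \| "
    r"consumed samples: (?P<samples>\d+) \| "
    r"elapsed time per iteration \(ms\): (?P<ms>[\d.]+) \| "
    r"learning rate: (?P<lr>[\d.E+-]+) \| "
    r"global batch size: (?P<gbs>\d+) \| "
    r"lm loss: (?P<loss>[\d.E+-]+) \| "
    r"loss scale: (?P<scale>[\d.]+) \| "
    r"grad norm: (?P<gnorm>[\d.]+)"
)

def parse_log(path="main_log.txt"):
    # Yield one dict per iteration record found in the log. Collapsing all
    # whitespace first makes the parse immune to records being wrapped
    # across physical lines, as they are in this dump.
    with open(path) as f:
        text = " ".join(f.read().split())
    for m in RECORD_RE.finditer(text):
        yield {
            "iteration": int(m["iteration"]),
            "consumed_samples": int(m["samples"]),
            "elapsed_ms": float(m["ms"]),
            "lr": float(m["lr"]),
            "global_batch_size": int(m["gbs"]),
            "lm_loss": float(m["loss"]),
            "loss_scale": float(m["scale"]),
            "grad_norm": float(m["gnorm"]),
        }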
iter | ms/iter | learning rate | lm loss | grad norm
 733 | 13725.2 | 3.253E-06 | 7.808900E+00 | 50692.428
 734 | 13115.2 | 3.257E-06 | 7.737029E+00 | 69149.275
 735 | 13493.9 | 3.262E-06 | 7.630354E+00 | 85240.602
 736 | 13636.0 | 3.266E-06 | 7.626644E+00 | 57646.552
 737 | 13810.1 | 3.271E-06 | 7.526936E+00 | 95065.076
 738 | 13385.6 | 3.275E-06 | 7.820796E+00 | 113407.272
 739 | 13689.8 | 3.280E-06 | 7.774467E+00 | 98657.078
 740 | 13965.2 | 3.284E-06 | 7.762564E+00 | 71745.217
 741 | 13569.2 | 3.288E-06 | 7.608281E+00 | 40905.544
 742 | 13635.8 | 3.293E-06 | 7.570668E+00 | 80257.423
 743 | 13669.8 | 3.297E-06 | 7.586653E+00 | 56412.186
 744 | 13473.9 | 3.302E-06 | 7.701398E+00 | 100221.753
 745 | 13453.8 | 3.306E-06 | 7.772648E+00 | 88519.971
 746 | 13732.5 | 3.311E-06 | 7.940891E+00 | 66980.299
 747 | 13956.5 | 3.315E-06 | 7.879022E+00 | 73008.302
 748 | 13250.5 | 3.320E-06 | 7.693480E+00 | 45346.275
 749 | 13529.3 | 3.324E-06 | 7.658270E+00 | 156261.718
 750 | 14110.0 | 3.328E-06 | 7.741945E+00 | 121818.343
 751 | 13463.3 | 3.333E-06 | 7.631550E+00 | 69835.617
 752 | 13424.2 | 3.337E-06 | 7.669878E+00 | 47821.077
 753 | 13566.2 | 3.342E-06 | 7.567214E+00 | 68234.683
 754 | 14065.3 | 3.346E-06 | 7.753268E+00 | 134900.848
 755 | 13518.6 | 3.351E-06 | 7.552173E+00 | 48964.281
 756 | 13728.7 | 3.355E-06 | 7.735795E+00 | 73204.769
 757 | 14082.3 | 3.359E-06 | 7.910018E+00 | 83429.905
 758 | 13428.5 | 3.364E-06 | 7.669195E+00 | 61137.847
 759 | 13632.1 | 3.368E-06 | 7.795278E+00 | 59141.292
 760 | 13624.6 | 3.373E-06 | 7.692988E+00 | 104447.460
 761 | 13611.0 | 3.377E-06 | 7.784515E+00 | 51368.314
 762 | 13558.6 | 3.382E-06 | 7.582584E+00 | 61983.639
 763 | 13793.4 | 3.386E-06 | 7.743572E+00 | 56837.599
 764 | 13743.7 | 3.391E-06 | 7.701952E+00 | 92476.492
 765 | 13529.8 | 3.395E-06 | 7.691103E+00 | 103276.953
 766 | 13189.2 | 3.399E-06 | 7.589336E+00 | 54735.017
 767 | 13483.6 | 3.404E-06 | 7.717595E+00 | 54456.344
 768 | 13780.9 | 3.408E-06 | 7.852913E+00 | 88912.086
 769 | 13724.3 | 3.413E-06 | 7.716819E+00 | 102833.662
 770 | 13377.3 | 3.417E-06 | 7.597641E+00 | 50835.662
 771 | 13692.5 | 3.422E-06 | 7.478999E+00 | 53587.154
 772 | 14180.5 | 3.426E-06 | 7.546258E+00 | 63294.983
 773 | 13096.5 | 3.430E-06 | 7.711743E+00 | 99934.626
 774 | 13520.5 | 3.435E-06 | 7.645664E+00 | 56458.777
 775 | 13630.5 | 3.439E-06 | 7.603559E+00 | 46450.456
 776 | 14027.6 | 3.444E-06 | 7.737686E+00 | 141770.957
 777 | 13425.6 | 3.448E-06 | 7.584914E+00 | 124071.305
 778 | 13642.7 | 3.453E-06 | 7.606685E+00 | 53139.139
 779 | 13834.1 | 3.457E-06 | 7.786515E+00 | 58657.499
 780 | 13091.5 | 3.462E-06 | 7.618142E+00 | 37881.566
 781 | 14146.0 | 3.466E-06 | 7.906812E+00 | 114163.942
 782 | 14025.7 | 3.470E-06 | 7.566094E+00 | 46220.333
 783 | 13895.4 | 3.475E-06 | 7.630446E+00 | 64319.125
 784 | 13890.1 | 3.479E-06 | 7.692337E+00 | 48575.291
 785 | 14156.1 | 3.484E-06 | 7.736514E+00 | 90651.125
 786 | 14206.7 | 3.488E-06 | 7.744794E+00 | 84355.344
 787 | 13622.2 | 3.493E-06 | 7.672806E+00 | 51705.493
 788 | 13771.2 | 3.497E-06 | 7.713612E+00 | 50748.595
 789 | 14226.1 | 3.501E-06 | 7.630927E+00 | 68226.483
 790 | 14175.2 | 3.506E-06 | 7.523444E+00 | 67731.569
 791 | 13844.2 | 3.510E-06 | 7.357096E+00 | 45569.401
 792 | 13884.3 | 3.515E-06 | 7.701885E+00 | 53017.231
 793 | 14159.9 | 3.519E-06 | 7.529918E+00 | 55466.888
 794 | 13975.0 | 3.524E-06 | 7.684763E+00 | 44801.760
 795 | 13769.3 | 3.528E-06 | 7.843237E+00 | 59761.590
 796 | 13954.1 | 3.533E-06 | 7.737316E+00 | 66240.870
 797 | 13982.4 | 3.537E-06 | 7.712746E+00 | 53315.803
 798 | 14164.1 | 3.541E-06 | 7.649867E+00 | 46451.967
 799 | 14010.0 | 3.546E-06 | 7.833376E+00 | 65829.045
 800 | 14307.9 | 3.550E-06 | 7.790625E+00 | 71968.262
 801 | 13972.6 | 3.555E-06 | 7.611866E+00 | 48597.309
 802 | 13959.0 | 3.559E-06 | 7.617666E+00 | 147672.383
 803 | 13806.4 | 3.564E-06 | 7.813154E+00 | 121980.871
 804 | 13949.2 | 3.568E-06 | 7.654176E+00 | 52351.960
 805 | 13801.9 | 3.572E-06 | 7.564305E+00 | 62792.545
 806 | 13954.3 | 3.577E-06 | 7.707185E+00 | 64767.398
 807 | 14250.4 | 3.581E-06 | 7.578569E+00 | 73926.917
 808 | 14201.0 | 3.586E-06 | 7.631069E+00 | 110069.754
 809 | 13598.4 | 3.590E-06 | 7.628491E+00 | 49670.988
 810 | 13941.6 | 3.595E-06 | 7.759563E+00 | 45971.027
 811 | 14298.0 | 3.599E-06 | 7.502759E+00 | 77602.902
 812 | 13416.1 | 3.604E-06 | 7.624804E+00 | 95989.772
 813 | 13579.1 | 3.608E-06 | 7.542982E+00 | 52064.554
 814 | 14100.2 | 3.612E-06 | 7.676429E+00 | 38221.569
 815 | 14346.2 | 3.617E-06 | 7.695131E+00 | 57869.513
 816 | 13771.7 | 3.621E-06 | 7.578337E+00 | 49771.695
 817 | 13776.0 | 3.626E-06 | 7.583301E+00 | 46160.592
 818 | 14040.8 | 3.630E-06 | 7.773385E+00 | 42207.098
 819 | 13835.3 | 3.635E-06 | 7.905573E+00 | 111883.611
 820 | 13924.4 | 3.639E-06 | 7.730550E+00 | 75433.173
 821 | 13915.0 | 3.643E-06 | 7.688564E+00 | 41927.693
 822 | 13890.4 | 3.648E-06 | 7.552343E+00 | 96543.909
 823 | 13560.6 | 3.652E-06 | 7.617982E+00 | 56370.152
 824 | 14024.1 | 3.657E-06 | 7.600199E+00 | 61928.907
 825 | 14003.2 | 3.661E-06 | 7.541789E+00 | 56863.341
 826 | 13848.3 | 3.666E-06 | 7.782004E+00 | 59985.533
 827 | 13902.1 | 3.670E-06 | 7.733065E+00 | 39148.960
 828 | 14356.1 | 3.675E-06 | 7.625387E+00 | 56612.459
 829 | 14368.0 | 3.679E-06 | 7.759684E+00 | 67635.907
 830 | 13627.9 | 3.683E-06 | 7.694915E+00 | 60776.045
 831 | 13498.1 | 3.688E-06 | 7.492978E+00 | 42000.715
 832 | 13938.9 | 3.692E-06 | 7.616700E+00 | 105579.700
 833 | 13687.8 | 3.697E-06 | 7.715961E+00 | 78119.339
 834 | 13717.8 | 3.701E-06 | 7.778497E+00 | 58326.728
 835 | 13913.9 | 3.706E-06 | 7.718093E+00 | 48122.513
 836 | 14318.5 | 3.710E-06 | 7.521303E+00 | 60082.150
 837 | 13780.0 | 3.714E-06 | 7.538383E+00 | 61043.143
 838 | 13961.2 | 3.719E-06 | 7.548276E+00 | 58423.396
 839 | 14239.6 | 3.723E-06 | 7.618182E+00 | 48500.077
 840 | 13752.3 | 3.728E-06 | 7.595082E+00 | 50825.625
 841 | 14199.3 | 3.732E-06 | 7.492725E+00 | 56977.964
 842 | 13925.4 | 3.737E-06 | 7.783816E+00 | 40797.888
 843 | 14119.4 | 3.741E-06 | 7.606951E+00 | 50890.553
 844 | 13941.8 | 3.746E-06 | 7.638199E+00 | 52652.311
 845 | 14424.1 | 3.750E-06 | 7.555171E+00 | 48298.607
 846 | 14202.9 | 3.754E-06 | 7.651504E+00 | 76618.386
 847 | 13785.9 | 3.759E-06 | 7.914087E+00 | 40970.022
 848 | 13892.7 | 3.763E-06 | 7.714731E+00 | 47666.946
 849 | 13608.6 | 3.768E-06 | 7.566309E+00 | 56337.203
 850 | 13752.1 | 3.772E-06 | 7.621016E+00 | 55695.680
 851 | 13514.6 | 3.777E-06 | 7.510153E+00 | 70852.784
 852 | 13536.1 | 3.781E-06 | 7.417966E+00 | 43169.299
 853 | 14116.4 | 3.786E-06 | 7.490001E+00 | 61980.012
 854 | 14372.8 | 3.790E-06 | 7.555287E+00 | 43650.333
 855 | 13154.5 | 3.794E-06 | 7.628311E+00 | 32290.729
 856 | 13509.6 | 3.799E-06 | 7.757495E+00 | 94063.051
 857 | 14015.7 | 3.803E-06 | 7.733263E+00 | 53189.090
 858 | 14357.8 | 3.808E-06 | 7.570580E+00 | 57239.238
 859 | 13954.6 | 3.812E-06 | 7.593122E+00 | 45414.199
 860 | 14212.3 | 3.817E-06 | 7.571471E+00 | 75659.476
 861 | 14044.0 | 3.821E-06 | 7.599829E+00 | 47651.114
 862 | 13529.5 | 3.825E-06 | 7.427186E+00 | 76377.661
 863 | 14057.3 | 3.830E-06 | 7.736305E+00 | 76320.820
 864 | 14064.2 | 3.834E-06 | 7.637553E+00 | 56695.795
 865 | 14009.0 | 3.839E-06 | 7.709378E+00 | 77647.024
 866 | 13951.3 | 3.843E-06 | 7.856131E+00 | 85925.999
 867 | 14427.4 | 3.848E-06 | 7.511599E+00 | 50353.044
 868 | 14117.9 | 3.852E-06 | 7.803133E+00 | 73334.122
 869 | 13519.9 | 3.857E-06 | 7.515793E+00 | 73466.425
 870 | 13901.3 | 3.861E-06 | 7.841221E+00 | 74455.188
 871 | 14383.8 | 3.865E-06 | 7.850037E+00 | 49579.751
 872 | 14031.3 | 3.870E-06 | 7.490081E+00 | 71074.482
 873 | 13971.5 | 3.874E-06 | 7.783985E+00 | 102193.504
 874 | 14176.3 | 3.879E-06 | 7.557288E+00 | 71546.244
 875 | 14495.9 | 3.883E-06 | 7.703010E+00 | 50279.497
 876 | 13722.6 | 3.888E-06 | 7.542592E+00 | 44841.536
 877 | 13946.5 | 3.892E-06 | 7.776785E+00 | 109756.647
 878 | 13948.7 | 3.896E-06 | 7.728590E+00 | 70820.820
 879 | 13882.9 | 3.901E-06 | 7.672616E+00 | 44570.920
 880 | 14042.4 | 3.905E-06 | 7.680589E+00 | 124008.380
 881 | 13930.7 | 3.910E-06 | 7.501089E+00 | 46056.517
 882 | 14239.7 | 3.914E-06 | 7.571886E+00 | 66612.529
 883 | 13486.8 | 3.919E-06 | 7.536567E+00 | 62829.154
 884 | 14209.0 | 3.923E-06 | 7.794725E+00 | 67729.342
 885 | 13720.4 | 3.928E-06 | 7.468060E+00 | 44457.501
 886 | 13867.7 | 3.932E-06 | 7.478938E+00 | 45629.682
 887 | 13805.2 | 3.936E-06 | 7.427522E+00 | 59355.003
 888 | 14520.3 | 3.941E-06 | 7.602240E+00 | 45450.350
 889 | 13870.2 | 3.945E-06 | 7.682034E+00 | 51153.138
 890 | 13708.4 | 3.950E-06 | 7.558862E+00 | 46389.657
 891 | 13645.4 | 3.954E-06 | 7.527663E+00 | 86582.230
 892 | 13652.2 | 3.959E-06 | 7.675562E+00 | 68924.015
 893 | 14020.9 | 3.963E-06 | 7.534761E+00 | 47359.573
 894 | 13841.4 | 3.967E-06 | 7.447322E+00 | 51692.050
 895 | 14037.6 | 3.972E-06 | 7.507210E+00 | 64045.210
 896 | 14109.9 | 3.976E-06 | 7.523023E+00 | 62130.023
 897 | 14567.0 | 3.981E-06 | 7.609581E+00 | 45111.563
 898 | 13613.4 | 3.985E-06 | 7.677504E+00 | 77037.256
 899 | 13889.7 | 3.990E-06 | 7.463535E+00 | 63218.567
 900 | 13953.1 | 3.994E-06 | 7.512316E+00 | 45889.461
 901 | 14162.8 | 3.999E-06 | 7.882708E+00 | 42823.467
 902 | 13923.6 | 4.003E-06 | 7.662213E+00 | 61513.464
 903 | 14309.5 | 4.007E-06 | 7.560106E+00 | 69145.911
 904 | 13872.6 | 4.012E-06 | 7.580536E+00 | 50555.734
 905 | 13660.1 | 4.016E-06 | 7.370582E+00 | 58747.890
 906 | 14302.6 | 4.021E-06 | 7.578561E+00 | 51271.016
 907 | 13761.7 | 4.025E-06 | 7.886317E+00 | 103662.947
 908 | 13804.9 | 4.030E-06 | 7.671743E+00 | 73682.928
 909 | 13551.5 | 4.034E-06 | 7.644366E+00 | 44749.062
 910 | 14145.8 | 4.038E-06 | 7.575992E+00 | 123440.918
 911 | 13697.4 | 4.043E-06 | 7.622074E+00 | 106507.983
 912 | 13234.0 | 4.047E-06 | 7.362756E+00 | 47407.480
 913 | 13588.2 | 4.052E-06 | 7.463619E+00 | 52603.656
 914 | 13866.4 | 4.056E-06 | 7.559254E+00 | 75070.449
 915 | 13445.5 | 4.061E-06 | 7.466935E+00 | 84703.653
 916 | 13592.3 | 4.065E-06 | 7.530110E+00 | 68897.329
 917 | 13623.0 | 4.070E-06 | 7.709665E+00 | 42674.546
 918 | 13933.4 | 4.074E-06 | 7.340624E+00 | 62308.866
 919 | 13383.8 | 4.078E-06 | 7.633225E+00 | 101681.252
 920 | 13577.7 | 4.083E-06 | 7.753546E+00 | 64758.234
 921 | 13615.2 | 4.087E-06 | 7.587958E+00 | 50894.580
 922 | 13349.8 | 4.092E-06 | 7.769899E+00 | 142837.991
 923 | 13909.6 | 4.096E-06 | 7.624977E+00 | 83848.961
 924 | 13544.9 | 4.101E-06 | 7.603238E+00 | 56820.812
 925 | 14229.7 | 4.105E-06 | 7.706733E+00 | 76791.134
 926 | 13216.1 | 4.109E-06 | 7.619715E+00 | 71541.361
 927 | 13878.1 | 4.114E-06 | 7.712871E+00 | 73909.646
 928 | 13952.8 | 4.118E-06 | 7.413386E+00 | 57651.288
 929 | 13472.5 | 4.123E-06 | 7.559020E+00 | 91128.588
 930 | 13393.9 | 4.127E-06 | 7.636448E+00 | 48957.093
 931 | 13547.0 | 4.132E-06 | 7.639730E+00 | 110788.722
 932 | 14018.3 | 4.136E-06 | 7.652531E+00 | 96359.374
 933 | 13449.4 | 4.141E-06 | 7.671719E+00 | 60936.312
 934 | 13624.9 | 4.145E-06 | 7.672961E+00 | 45848.114
 935 | 13787.5 | 4.149E-06 | 7.740889E+00 | 140359.981
 936 | 13643.3 | 4.154E-06 | 7.595088E+00 | 125926.574
 937 | 13588.2 | 4.158E-06 | 7.580822E+00 | 88915.383
 938 | 13606.3 | 4.163E-06 | 7.766950E+00 | 88671.645
iteration 939/ 159576 | consumed
samples: 15024 | elapsed time per iteration (ms): 13894.4 | learning rate: 4.167E-06 | global batch size: 16 | lm loss: 7.578055E+00 | loss scale: 8192.0 | grad norm: 66434.885 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 940/ 159576 | consumed samples: 15040 | elapsed time per iteration (ms): 13885.0 | learning rate: 4.172E-06 | global batch size: 16 | lm loss: 7.837738E+00 | loss scale: 8192.0 | grad norm: 64490.261 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 941/ 159576 | consumed samples: 15056 | elapsed time per iteration (ms): 14127.9 | learning rate: 4.176E-06 | global batch size: 16 | lm loss: 7.961911E+00 | loss scale: 8192.0 | grad norm: 155493.780 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 942/ 159576 | consumed samples: 15072 | elapsed time per iteration (ms): 14120.5 | learning rate: 4.180E-06 | global batch size: 16 | lm loss: 7.581886E+00 | loss scale: 8192.0 | grad norm: 84829.182 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -saving checkpoint at iteration 942 to /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints -[2021-09-24 05:51:49,558] [INFO] [logging.py:68:log_dist] [Rank 0] Saving model checkpoint: /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/global_step942/mp_rank_00_model_states.pt - successfully saved checkpoint at iteration 942 to /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints -time (ms) | save-checkpoint: 17459.68 -[exiting program after 110.12040019432703 minutes] datetime: 2021-09-24 05:52:01 -***************************************** -Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. -***************************************** -***************************************** -Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. -***************************************** -***************************************** -Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. -***************************************** -***************************************** -Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. -***************************************** -***************************************** -Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. -***************************************** -***************************************** -Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. 
---------------------------------------------------
-DeepSpeed C++/CUDA extension op report
---------------------------------------------------
-NOTE: Ops not installed will be just-in-time (JIT) compiled at
- runtime if needed. Op compatibility means that your system
- meet the required dependencies to JIT install the op.
---------------------------------------------------
-JIT compiled ops requires ninja
-ninja .................. [OKAY]
---------------------------------------------------
-op name ................ installed .. compatible
---------------------------------------------------
-cpu_adam ............... [YES] ...... [OKAY]
-fused_adam ............. [NO] ....... [OKAY]
-fused_lamb ............. [NO] ....... [OKAY]
-sparse_attn ............ [NO] ....... [OKAY]
-transformer ............ [NO] ....... [OKAY]
-stochastic_transformer . [NO] ....... [OKAY]
---------------------------------------------------
- [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`.
-async_io ............... [NO] ....... [NO]
-transformer_inference .. [NO] ....... [OKAY]
-utils .................. [YES] ...... [OKAY]
-quantizer .............. [NO] ....... [OKAY]
---------------------------------------------------
-DeepSpeed general environment info:
-torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']
-torch version .................... 1.8.1
-torch cuda version ............... 11.1
-nvcc version ..................... 11.2
-deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']
-deepspeed info ................... 0.4.2+bc17042, bc17042, big-science
-deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1
..............[NO] [NO]....... .......[OKAY] -[OKAY] --------------------------------------------------- --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. 
-/bin/sh: line 0: type: git: not found
-**** Git info for Megatron: git_hash=unknown git_branch=unknown ****
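The `type: git: not found` line means the shell probe for a git binary failed inside the job environment, so Megatron's startup banner falls back to "unknown" for both hash and branch. A hedged reconstruction of that fallback logic (not Megatron's actual helper; the function name here is illustrative):

    import subprocess

    def git_info():
        # Falls back to "unknown" when git is absent or the tree is not a
        # repository, matching the banner printed above.
        try:
            h = subprocess.check_output(
                ["git", "rev-parse", "--short", "HEAD"], text=True).strip()
            b = subprocess.check_output(
                ["git", "rev-parse", "--abbrev-ref", "HEAD"], text=True).strip()
        except (OSError, subprocess.CalledProcessError):
            h = b = "unknown"
        return h, b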
-DeepSpeed general environment info:
-torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']
-torch version .................... 1.8.1
-torch cuda version ............... 11.1
-nvcc version ..................... 11.2
-deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']
-deepspeed info ................... 0.4.2+bc17042, bc17042, big-science
-deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1
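This block records the software stack the run was launched with: torch 1.8.1 built against CUDA 11.1, nvcc 11.2 on the system, and the big-science branch of DeepSpeed (0.4.2+bc17042) installed from source. The same report is printed by `ds_report`; the core fields can also be read directly from standard torch/deepspeed attributes (a minimal sketch):

    import torch
    import deepspeed

    # These attributes back the report above; the nvcc version is normally
    # probed by shelling out to `nvcc --version`, omitted here for brevity.
    print("torch install path ...", list(torch.__path__))
    print("torch version ........", torch.__version__)
    print("torch cuda version ...", torch.version.cuda)
    print("deepspeed info .......", deepspeed.__version__)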
[OKAY][OKAY] - -utilsutils .................................... [YES][YES] ............ [OKAY][OKAY] - -quantizerquantizer ............................ [NO][NO] .............. [OKAY][OKAY] - --------------------------------------------------- --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. --------------------------------------------------- -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -ninjaninja .................................... [OKAY] -[OKAY] --------------------------------------------------- ---------------------------------------------------op name - op name................ installed................ .. installedcompatible -..-------------------------------------------------- -compatible --------------------------------------------------- -cpu_adam ............... [YES] cpu_adam...... [OKAY]............... - [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_adam ............. fused_lamb[NO] .................... [NO] [OKAY]....... -[OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY]sparse_attn - transformer............ ............[NO] [NO] .............. [OKAY][OKAY] - -transformerstochastic_transformer ............ .[NO] [NO]....... ....... [OKAY][OKAY] - -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... 
[OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -stochastic_transformer . [NO] ....... [OKAY] -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... 
[OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -transformer_inference .. [NO] ....... [OKAY] -async_io ............... [NO] ....... [NO] -utils .................. [YES] ...... [OKAY] -transformer_inference .. [NO] ....... [OKAY] -quantizer .............. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] --------------------------------------------------- -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** --------------------------------------------------- -ninja .................. [OKAY] --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -DeepSpeed C++/CUDA extension op report -fused_adam ............. [NO] ....... [OKAY] --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... 
[YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 
1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. 
compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. --------------------------------------------------- -JIT compiled ops requires ninja -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -ninja .................. [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -JIT compiled ops requires ninja -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. 
[OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -fused_adam ............. [NO] ....... [OKAY] -JIT compiled ops requires ninja -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. 
compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -DeepSpeed general environment info: --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -op name ................ installed .. compatible --------------------------------------------------- -fused_adam ............. [NO] ....... [OKAY] -cpu_adam ............... [YES] ...... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. 
--------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -/bin/sh: line 0: type: git: not found -ninja .................. [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -op name ................ installed .. compatible --------------------------------------------------- -async_io ............... [NO] ....... [NO] -cpu_adam ............... [YES] ...... [OKAY] -transformer_inference .. [NO] ....... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -fused_lamb ............. [NO] ....... [OKAY] -quantizer .............. [NO] ....... [OKAY] -DeepSpeed general environment info: -sparse_attn ............ [NO] ....... [OKAY] --------------------------------------------------- -transformer ............ [NO] ....... [OKAY] -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -stochastic_transformer . [NO] ....... [OKAY] -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... 
[OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -transformer_inference .. [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... 
[OKAY] --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -ninja .................. [OKAY] -cpu_adam ...............-------------------------------------------------- -[YES]op name ...................... [OKAY]installed - .. compatible --------------------------------------------------- -fused_adam ............. cpu_adam[NO] ............... .......[YES] [OKAY]...... -[OKAY] -fused_lamb ............. [NO] ....... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -sparse_attnfused_lamb ......................... [NO][NO] .............. [OKAY] -[OKAY] -transformer ............ [NO] ....... [OKAY] -sparse_attn ............ stochastic_transformer[NO] ....... .[OKAY] - [NO] ....... [OKAY]transformer - ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] --------------------------------------------------- -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system -transformer ............ [NO] ....... [OKAY] - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... 
[OKAY] -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install pathtorch install path ............... ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch versiontorch version ........................................ 1.8.11.8.1 - -torch cuda versiontorch cuda version .............................. 11.111.1 - -nvcc versionnvcc version .......................................... 11.211.2 - -deepspeed install pathdeepspeed install path ........... ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed infodeepspeed info ...................................... 0.4.2+bc17042, bc17042, big-science0.4.2+bc17042, bc17042, big-science - -deepspeed wheel compiled w.deepspeed wheel compiled w. ............ torch 1.8, cuda 11.1torch 1.8, cuda 11.1 - --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. 
Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... 
[OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -DeepSpeed general environment info: -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -torch version .................... 1.8.1 -stochastic_transformer . [NO] ....... [OKAY] -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. 
compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -DeepSpeed general environment info: --------------------------------------------------- -DeepSpeed C++/CUDA extension op report -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -torch version .................... 1.8.1 --------------------------------------------------- -JIT compiled ops requires ninja -JIT compiled ops requires ninja -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -ninja .................. [OKAY] -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] --------------------------------------------------- -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -op name ................ installed .. compatible --------------------------------------------------- --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system -fused_lamb ............. [NO] ....... [OKAY] - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ 
- [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`.
-async_io ............... [NO] ....... [NO]
-transformer_inference .. [NO] ....... [OKAY]
-utils .................. [YES] ...... [OKAY]
-quantizer .............. [NO] ....... [OKAY]
---------------------------------------------------
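The async_io row reads [NO]/[NO] because the libaio headers are missing, exactly as the warning states; after `apt install libaio-dev` the op becomes JIT-compilable. A minimal re-check sketch (the AsyncIOBuilder import path follows DeepSpeed 0.4.x conventions and should be treated as an assumption):

    from deepspeed.ops.op_builder import AsyncIOBuilder

    # Re-runs the dependency check behind the [WARNING] above; expected to
    # return True once libaio-dev is installed on the node.
    print(AsyncIOBuilder().is_compatible())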
--------------------------------------------------- -JIT compiled ops requires ninja -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install pathtorch install path .............................. ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch versiontorch version ........................................ 1.8.11.8.1 - -torch cuda versiontorch cuda version .............................. 11.111.1 - -nvcc versionnvcc version .......................................... 11.211.2 - -deepspeed install pathdeepspeed install path ........... ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed info deepspeed info................... ...................0.4.2+bc17042, bc17042, big-science -0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. deepspeed wheel compiled w....... ......torch 1.8, cuda 11.1 -torch 1.8, cuda 11.1 -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install pathtorch install path .............................. ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch versiontorch version ........................................ 1.8.11.8.1 - -torch cuda versiontorch cuda version .............................. 11.111.1 - -nvcc versionnvcc version .......................................... 11.211.2 - -deepspeed install pathdeepspeed install path ........... ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info deepspeed info................... ...................0.4.2+bc17042, bc17042, big-science -0.4.2+bc17042, bc17042, big-sciencedeepspeed wheel compiled w. - deepspeed wheel compiled w....... ......torch 1.8, cuda 11.1 -torch 1.8, cuda 11.1 -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. 
--------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -ninja .................. [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install pathtorch install path .............................. ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch versiontorch version ........................................ 1.8.11.8.1 - -torch cuda versiontorch cuda version .............................. 11.111.1 - -nvcc versionnvcc version .......................................... 11.211.2 - -deepspeed install pathdeepspeed install path ...................... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed infodeepspeed info ...................................... 0.4.2+bc17042, bc17042, big-science0.4.2+bc17042, bc17042, big-science - -deepspeed wheel compiled w.deepspeed wheel compiled w. ............ torch 1.8, cuda 11.1torch 1.8, cuda 11.1 - -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. 
--------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report ----------------------------------------------------------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - -DeepSpeed C++/CUDA extension op report-------------------------------------------------- - -JIT compiled ops requires ninja-------------------------------------------------- - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -DeepSpeed general environment info:DeepSpeed general environment info: - -ninja .................. [OKAY] --------------------------------------------------- -torch install pathtorch install path .............................. ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch versiontorch version ........................................ 1.8.11.8.1 - -op name ................ installed .. compatible -torch cuda versiontorch cuda version .............................. 11.111.1 - -nvcc versionnvcc version .......................................... 11.211.2 - --------------------------------------------------- -deepspeed install pathdeepspeed install path ...................... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed infodeepspeed info ...................................... 0.4.2+bc17042, bc17042, big-science0.4.2+bc17042, bc17042, big-science - -cpu_adam ............... [YES] ...... [OKAY] -deepspeed wheel compiled w.deepspeed wheel compiled w. ............ torch 1.8, cuda 11.1torch 1.8, cuda 11.1 - -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -DeepSpeed general environment info: -transformer ............ [NO] ....... [OKAY] -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -stochastic_transformer . [NO] ....... [OKAY] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... 
['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY]ninja - .................. fused_lamb[OKAY] -............. --------------------------------------------------[NO] - .......op name [OKAY]................ - installed .. compatible --------------------------------------------------- -sparse_attn ............cpu_adam [NO]............... .......[YES] [OKAY]...... - [OKAY]transformer - ............ [NO] ....... [OKAY] -stochastic_transformerfused_adam .............. [NO][NO] .............. [OKAY][OKAY] - -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -/bin/sh: line 0: type: git: not found -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. 
--------------------------------------------------- -JIT compiled ops requires ninja -DeepSpeed general environment info: - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -async_ioasync_io .............................. [NO][NO] .............. [NO][NO] - -torch version .................... 1.8.1 -transformer_inferencetransformer_inference .... [NO][NO] .............. [OKAY][OKAY] - -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -utilsutils .................................... [YES] [YES]...... ......[OKAY] -[OKAY] -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -quantizerquantizer ............................ [NO][NO] .............. [OKAY][OKAY] - -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 ----------------------------------------------------------------------------------------------------- - -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -ninja .................. [OKAY] -transformer_inference .. [NO] ....... [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -utils .................. [YES] ...... [OKAY] -cpu_adam ............... [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- --------------------------------------------------- -DeepSpeed general environment info: -torch install path ............... 
['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -async_io ............... [NO] ....... [NO] -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... 
[NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -fused_lamb ............. [NO] ....... [OKAY] -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. 
--------------------------------------------------- -JIT compiled ops requires ninja - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninjaninja .................................... [OKAY][OKAY] - -ninja---------------------------------------------------------------------------------------------------- - -..................op nameop name [OKAY]................ - ................installed-------------------------------------------------- -installed..op name compatible.................. - compatibleinstalled-------------------------------------------------- - -.. --------------------------------------------------compatible - --------------------------------------------------- -cpu_adam ............... [YES]cpu_adam ......cpu_adam............... [OKAY]...............[YES] - ......[YES] [OKAY]...... - [OKAY] -fused_adam ............. [NO] ....... fused_adam[OKAY]fused_adam - .......................... [NO]fused_lamb[NO] ........................... [OKAY][OKAY][NO] - - .......fused_lamb fused_lamb [OKAY] ............. -............. [NO][NO] .............. [OKAY][OKAY] - -sparse_attn ............ [NO] .......ninja [OKAY] -sparse_attn..................sparse_attn transformer ............ [OKAY]............ ............ -[NO] [NO] --------------------------------------------------[NO].............. - op name[OKAY].......[OKAY] - -................[OKAY] transformerinstalledtransformer - .. ............ ............ stochastic_transformer[NO]compatible - [NO]--------------------------------------------------....... -[OKAY]........ - [NO][OKAY] -.......cpu_adamstochastic_transformer stochastic_transformer...............[OKAY] . - [YES].[NO] ...... [NO].......[OKAY] ....... -[OKAY] -[OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... 
[OKAY] -utils .................. [YES] ...... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -quantizer .............. [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. --------------------------------------------------- -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. 
-async_io ............... [NO] ....... [NO] [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -transformer_inference .. [NO] ....... [OKAY] -async_io ...............utils [NO].................. .......[YES] [NO]...... - [OKAY] -quantizer .............. [NO] transformer_inference....... ..[OKAY] -[NO] ....... [OKAY]-------------------------------------------------- - -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. --------------------------------------------------- -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... 
[OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -quantizer .............. [NO] ....... [OKAY] -async_io ............... [NO] async_io....... [NO] --------------------------------------------------- -............... [NO] ....... [NO]transformer_inference - .. [NO] ....... [OKAY] -utilstransformer_inference .................... [YES][NO] ............. [OKAY][OKAY] - -quantizer .............. utils[NO] ......................... [YES][OKAY] -...... [OKAY] --------------------------------------------------- -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -transformer_inference .. 
[NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] --------------------------------------------------- -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed general environment info: -DeepSpeed general environment info:torch install path - ............... torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']1.8.1 - -torch cuda versiontorch version ................................... 11.11.8.1 - -nvcc version torch cuda version..................... ...............11.2 -11.1deepspeed install path - nvcc version........... ..................... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -11.2deepspeed info - deepspeed install path................... ...........0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w.['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -...... deepspeed infotorch 1.8, cuda 11.1 -................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. 
[NO] ....... [OKAY] --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... 
[OKAY] --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -ninja .................. [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] --------------------------------------------------- -stochastic_transformer . [NO] ....... [OKAY] -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. 
- -DeepSpeed general environment info: -async_io ............... [NO] ....... [NO]async_io -DeepSpeed general environment info:torch install path - ............... [NO] ....... [NO] -............... torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version ....................['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -1.8.1 -transformer_inference .. [NO] ....... [OKAY] -torch versiontorch cuda version ................................... 1.8.111.1 - -nvcc version torch cuda version..................... ............... 11.211.1 - -transformer_inference ..utils [NO].................. .......[YES] [OKAY]...... -[OKAY] -deepspeed install pathnvcc version ................................ 11.2['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed install pathdeepspeed info .............................. 0.4.2+bc17042, bc17042, big-science['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -quantizer utils.............. ..................[NO] [YES]....... [OKAY]...... - -deepspeed wheel compiled w.deepspeed info ......................... torch 1.8, cuda 11.10.4.2+bc17042, bc17042, big-science - -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [OKAY] --------------------------------------------------- -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`.async_io - ............... [NO] ....... [NO] -transformer_inferenceasync_io .. ...............[NO] [NO]....... .......[OKAY] -[NO] -utils .................. [YES] ...... [OKAY] -transformer_inference .. quantizer[NO] ..................... [NO][OKAY] -....... [OKAY] ---------------------------------------------------utils -.................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_io ............... async_io[NO] ....... ............... [NO] ....... [NO] - [NO] -transformer_inferencetransformer_inference .... [NO] ....... [OKAY] -[NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -utils .................. quantizer[YES] .................... [NO][OKAY] -....... [OKAY] -quantizer ..............-------------------------------------------------- -[NO] ....... [OKAY] --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. 
-/bin/sh: line 0: type: git: not found
-**** Git info for Megatron: git_hash=unknown git_branch=unknown ****
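-The two lines above come from Megatron probing for git in a shell where it is not on PATH, so the banner falls back to "unknown" for both fields. The pattern is roughly the following sketch (not Megatron's exact code):
-
-import subprocess
-
-def git_info() -> dict:
-    try:
-        git_hash = subprocess.check_output(
-            ["git", "rev-parse", "--short", "HEAD"], text=True).strip()
-        git_branch = subprocess.check_output(
-            ["git", "rev-parse", "--abbrev-ref", "HEAD"], text=True).strip()
-    except (OSError, subprocess.CalledProcessError):
-        git_hash = git_branch = "unknown"  # git missing, or not a git checkout
-    return {"git_hash": git_hash, "git_branch": git_branch}
-
-print("**** Git info for Megatron: git_hash={git_hash} git_branch={git_branch} ****".format(**git_info()))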
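-As the op report at the top of this excerpt notes, ops marked [NO] under "installed" (fused_adam, fused_lamb, sparse_attn, transformer, stochastic_transformer) are built just-in-time with ninja on first use. A minimal sketch of triggering one such build, assuming the op_builder API of the DeepSpeed 0.4.x tree this log was produced with:
-
-from deepspeed.ops.op_builder import FusedAdamBuilder
-
-builder = FusedAdamBuilder()
-print("compatible:", builder.is_compatible())  # the [OKAY] column in the report above
-fused_adam_op = builder.load()  # JIT-compiles the extension via ninja if not pre-installed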
-DeepSpeed general environment info:
-torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']
-torch version .................... 1.8.1
-torch cuda version ............... 11.1
-nvcc version ..................... 11.2
-deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']
-deepspeed info ................... 
...................0.4.2+bc17042, bc17042, big-science -0.4.2+bc17042, bc17042, big-sciencedeepspeed wheel compiled w. - deepspeed wheel compiled w....... ......torch 1.8, cuda 11.1 -torch 1.8, cuda 11.1 -DeepSpeed general environment info: -DeepSpeed general environment info:torch install path - ............... torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -.................... 1.8.1torch version - ....................torch cuda version 1.8.1............... - 11.1torch cuda version - nvcc version............... .....................11.1 -11.2 -nvcc version deepspeed install path..................... ...........11.2 -['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']deepspeed install path - ...........deepspeed info ...................['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -0.4.2+bc17042, bc17042, big-sciencedeepspeed info - deepspeed wheel compiled w.................... ......0.4.2+bc17042, bc17042, big-science -torch 1.8, cuda 11.1deepspeed wheel compiled w. - ...... torch 1.8, cuda 11.1 --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... 
['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_ioasync_io .............................. [NO][NO] .............. [NO][NO] - -transformer_inferencetransformer_inference .... [NO][NO] .............. [OKAY][OKAY] - -utilsutils .................................... [YES][YES] ............ [OKAY][OKAY] - -quantizerquantizer ............................ [NO][NO] .............. [OKAY][OKAY] - --------------------------------------------------- --------------------------------------------------- -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -/bin/sh: line 0: type: git: not found -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. 
compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -stochastic_transformer . [NO] ....... [OKAY] -/bin/sh: line 0: type: git: not found --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report -fused_adam ............. [NO] ....... [OKAY] --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. -fused_lamb ............. [NO] ....... [OKAY] --------------------------------------------------- -JIT compiled ops requires ninja -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -ninja .................. 
[OKAY] --------------------------------------------------- -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -ninja .................. [OKAY] -DeepSpeed general environment info: -sparse_attn ............ [NO] ....... [OKAY] -DeepSpeed general environment info: --------------------------------------------------- -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -transformer ............ [NO] ....... [OKAY] -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -DeepSpeed general environment info:torch version -op name ................ installed .. compatible -torch version .................... 1.8.1 -.................... 1.8.1 --------------------------------------------------- -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -stochastic_transformer . [NO] ....... [OKAY] -torch cuda versiontorch install path ............... ...............11.1 -nvcc version ..................... 11.2 -cpu_adam ............... [YES] ...... [OKAY] -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']deepspeed install path - ........... 
torch version['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -fused_adam ............. [NO] ....... [OKAY] -.................... 1.8.1 -torch cuda version ............... 11.1deepspeed info -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] - nvcc version................... ..................... 11.2 -deepspeed install path 0.4.2+bc17042, bc17042, big-science........... - deepspeed wheel compiled w.['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science...... -transformer ............ [NO] ....... [OKAY] - deepspeed wheel compiled w.torch 1.8, cuda 11.1 -...... torch 1.8, cuda 11.1 -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -ninjacpu_adam ................................. [OKAY][YES] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. 
Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -DeepSpeed general environment info: - ......-------------------------------------------------- -[OKAY]op name -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 - ................ installed .. compatible -torch cuda version ............... 11.1 --------------------------------------------------- -fused_adam ............. [NO] ....... [OKAY] -nvcc version ..................... 11.2 -cpu_adam fused_lamb............... [YES]............. ...... [NO][OKAY] -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -....... [OKAY] -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -fused_adam ............. sparse_attn[NO] ................... [OKAY][NO] - ....... [OKAY]fused_lamb - ............. [NO]transformer ....... ............[OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -[NO] ....... [OKAY] -async_io ............... [NO] ....... [NO] -stochastic_transformer . [NO] sparse_attn....... ............[OKAY] -[NO] ....... [OKAY] -transformer_inference .. [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... 
[OKAY] --------------------------------------------------- -DeepSpeed general environment info: - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -async_io ............... [NO] ....... [NO] -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -transformer_inference .. [NO] ....... [OKAY] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -utils .................. [YES] ...... [OKAY] -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed general environment info: -torch install path ............... DeepSpeed general environment info:['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch version .................... 1.8.1 -torch install pathtorch cuda version .............................. 11.1 -nvcc version ..................... 11.2['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -deepspeed install path ...........torch version ....................['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -1.8.1deepspeed info - ...................torch cuda version 0.4.2+bc17042, bc17042, big-science............... - deepspeed wheel compiled w.11.1 -......nvcc version torch 1.8, cuda 11.1..................... - 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. 
--------------------------------------------------- -JIT compiled ops requires ninja -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -JIT compiled ops requires ninja -async_io ............... [NO]async_io ....... ...............[NO] -[NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -transformer_inference utils.. [NO] ....... [OKAY] -.................. [YES] ...... utils[OKAY] -.................. [YES] ......quantizer [OKAY].............. - [NO] ....... [OKAY]quantizer - .............. --------------------------------------------------[NO] - ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -ninja .................. [OKAY] -async_io ............... [NO] ....... [NO] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -transformer_inference .. [NO] ....... [OKAY] -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -fused_lamb ............. 
[NO] ....... [OKAY] -quantizer .............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] --------------------------------------------------- -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -/bin/sh: line 0: type: git: not found - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io async_io............... [NO] ...................... [NO][NO] - ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -transformer_inference .. [NO]utils ......................... [OKAY][YES] - ...... [OKAY] -utils ..................quantizer [YES].............. ......[NO] [OKAY]....... - [OKAY] -quantizer-------------------------------------------------- -.............. [NO] ....... [OKAY] --------------------------------------------------- -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** ----------------------------------------------------------------------------------------------------- - -DeepSpeed C++/CUDA extension op reportDeepSpeed C++/CUDA extension op report - --------------------------------------------------- ---------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - ---------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - -JIT compiled ops requires ninja --------------------------------------------------- -JIT compiled ops requires ninja -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 
0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install pathtorch install path .............................. ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch versiontorch version ........................................ 1.8.11.8.1 - -torch cuda versiontorch cuda version .............................. 11.111.1 - -nvcc versionnvcc version .......................................... 11.211.2 - -deepspeed install pathdeepspeed install path ...................... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed infodeepspeed info ...................................... 0.4.2+bc17042, bc17042, big-science0.4.2+bc17042, bc17042, big-science - -deepspeed wheel compiled w.deepspeed wheel compiled w. ............ torch 1.8, cuda 11.1torch 1.8, cuda 11.1 - -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES]ninja ...... ..................[OKAY] -[OKAY] --------------------------------------------------- -op name ................ installed ..fused_adam compatible............. - --------------------------------------------------[NO] - ....... [OKAY] -cpu_adamfused_lamb ............... .............[YES] [NO]...... .......[OKAY] - [OKAY] -fused_adamsparse_attn ......................... [NO] [NO]....... .......[OKAY] - [OKAY] -fused_lambtransformer ......................... [NO] [NO]....... .......[OKAY] - [OKAY] -stochastic_transformer . [NO] ....... sparse_attn ............ [OKAY][NO] -....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -transformer_inference .. [NO] ....... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... 
[OKAY] --------------------------------------------------- -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_io ............... async_io[NO] ....... ...............[NO] -[NO] ....... [NO] -transformer_inference .. [NO]transformer_inference ......... [OKAY][NO] - ....... [OKAY] -utils .................. [YES] utils...... ..................[OKAY] -[YES] ...... [OKAY]quantizer - .............. [NO] .......quantizer [OKAY].............. - [NO] .......-------------------------------------------------- -[OKAY] --------------------------------------------------- -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install pathtorch install path .............................. ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch versiontorch version ........................................ 1.8.11.8.1 - -torch cuda version torch cuda version............... ...............11.1 -11.1nvcc version - nvcc version..................... .....................11.2 -11.2deepspeed install path - deepspeed install path........... ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed infodeepspeed info ...................................... 0.4.2+bc17042, bc17042, big-science0.4.2+bc17042, bc17042, big-science - -deepspeed wheel compiled w.deepspeed wheel compiled w. ............ torch 1.8, cuda 11.1torch 1.8, cuda 11.1 - --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_ioasync_io .............................. [NO][NO] .............. [NO][NO] - -transformer_inference .. [NO] transformer_inference....... ..[OKAY] -[NO] ....... [OKAY] -utils .................. [YES] utils...... ..................[OKAY] -[YES] ......quantizer [OKAY].............. - [NO] .......quantizer [OKAY].............. - [NO] .......-------------------------------------------------- -[OKAY] --------------------------------------------------- -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 
0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 
1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - -ninja .................. [OKAY] -ninja-------------------------------------------------- -.................. op name[OKAY] -................ installed-------------------------------------------------- -.. op namecompatible -................-------------------------------------------------- -installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] .......fused_adam [OKAY]............. - [NO] ....... fused_lamb[OKAY] -............. [NO] fused_lamb....... .............[OKAY] -[NO] ....... [OKAY] -sparse_attn ............ [NO] sparse_attn....... ............[OKAY] -[NO] .......transformer [OKAY]............ - [NO] ....... transformer[OKAY] -............ [NO] ....... [OKAY]stochastic_transformer - .stochastic_transformer [NO] ........ [OKAY][NO] - ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_ioasync_io .............................. [NO][NO] .............. 
[NO][NO] - -transformer_inference ..transformer_inference [NO].. .......[NO] [OKAY]....... - [OKAY] -utils utils.................. ..................[YES] [YES]...... ......[OKAY] -[OKAY] -quantizerquantizer ............................ [NO][NO] .............. [OKAY][OKAY] - ----------------------------------------------------------------------------------------------------- - --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install pathtorch install path ............... ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch versiontorch version ........................................ 1.8.11.8.1 - -torch cuda versiontorch cuda version .............................. 11.111.1 - -nvcc versionnvcc version .......................................... 11.211.2 - -deepspeed install pathdeepspeed install path ...................... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed infodeepspeed info ...................................... 0.4.2+bc17042, bc17042, big-science0.4.2+bc17042, bc17042, big-science - -deepspeed wheel compiled w.deepspeed wheel compiled w. ............ torch 1.8, cuda 11.1torch 1.8, cuda 11.1 - -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -DeepSpeed general environment info: -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -stochastic_transformer . [NO] ....... [OKAY] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... 
['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -/bin/sh: line 0: type: git: not found - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -quantizer .............. [NO] ....... 
-using world size: 256, data-parallel-size: 8, tensor-model-parallel size: 4, pipeline-model-parallel size: 8
-using torch.float16 for parameters ...
-------------------------- arguments ------------------------
- accumulate_allreduce_grads_in_fp32 .............. False
- adam_beta1 ...................................... 0.9
- adam_beta2 ...................................... 0.999
- adam_eps ........................................ 1e-08
- adlr_autoresume ................................. False
- adlr_autoresume_interval ........................ 1000
- apply_query_key_layer_scaling ................... True
- apply_residual_connection_post_layernorm ........ False
- attention_dropout ............................... 0.1
- attention_softmax_in_fp32 ....................... False
- bert_binary_head ................................ True
- bert_load ....................................... None
- bf16 ............................................ False
- bias_dropout_fusion ............................. True
- bias_gelu_fusion ................................ True
- biencoder_projection_dim ........................ 0
- biencoder_shared_query_context_model ............ False
- block_data_path ................................. None
- checkpoint_activations .......................... True
- checkpoint_in_cpu ............................... False
- checkpoint_num_layers ........................... 1
- clip_grad ....................................... 1.0
- codecarbon_dir .................................. /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/tr8-104B-logs/codecarbon
- consumed_train_samples .......................... 0
- consumed_valid_samples .......................... 0
- contigious_checkpointing ........................ False
- cpu_optimizer ................................... False
- cpu_torch_adam .................................. False
- data_impl ....................................... mmap
- data_parallel_size .............................. 8
- data_path ....................................... ['/gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document']
- dataloader_type ................................. single
- DDP_impl ........................................ local
- decoder_seq_length .............................. None
- deepscale ....................................... False
- deepscale_config ................................ None
- deepspeed ....................................... True
- deepspeed_activation_checkpointing .............. True
- deepspeed_config ................................ ./ds_config.1164492.json
- deepspeed_mpi ................................... False
- distribute_checkpointed_activations ............. False
- distributed_backend ............................. nccl
- embedding_path .................................. None
- encoder_seq_length .............................. 2048
- eod_mask_loss ................................... False
- eval_interval ................................... 1000
- eval_iters ...................................... 5
- evidence_data_path .............................. None
- exit_duration_in_mins ........................... 1190
- exit_interval ................................... None
- ffn_hidden_size ................................. 20480
- finetune ........................................ False
- fp16 ............................................ True
- fp16_lm_cross_entropy ........................... False
- fp32_residual_connection ........................ False
- global_batch_size ............................... 2048
- hidden_dropout .................................. 0.1
- hidden_size ..................................... 16384
- hysteresis ...................................... 2
- ict_head_size ................................... None
- ict_load ........................................ None
- img_dim ......................................... 224
- indexer_batch_size .............................. 128
- indexer_log_interval ............................ 1000
- init_method_std ................................. 0.02
- init_method_xavier_uniform ...................... False
- initial_loss_scale .............................. 4294967296
- kv_channels ..................................... 512
- layernorm_epsilon ............................... 1e-05
- lazy_mpu_init ................................... None
- load ............................................ /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints
- local_rank ...................................... 0
- log_batch_size_to_tensorboard ................... True
- log_interval .................................... 1
- log_learning_rate_to_tensorboard ................ True
- log_loss_scale_to_tensorboard ................... True
- log_num_zeros_in_grad ........................... False
- log_params_norm ................................. False
- log_timers_to_tensorboard ....................... True
- log_validation_ppl_to_tensorboard ............... True
- loss_scale ...................................... 12.0
- loss_scale_window ............................... 1000
- lr .............................................. 6e-05
- lr_decay_iters .................................. None
- lr_decay_samples ................................ 126953125
- lr_decay_style .................................. cosine
- lr_warmup_fraction .............................. None
- lr_warmup_iters ................................. 0
- lr_warmup_samples ............................... 216320
- make_vocab_size_divisible_by .................... 128
- mask_prob ....................................... 0.15
- masked_softmax_fusion ........................... True
- max_position_embeddings ......................... 2048
- memory_centric_tiled_linear ..................... False
- merge_file ...................................... /gpfswork/rech/six/commun/code/tr8-104B/Megatron-DeepSpeed-tr8-104B/data/gpt2-merges.txt
- micro_batch_size ................................ 1
- min_loss_scale .................................. 1.0
- min_lr .......................................... 6e-06
- mmap_warmup ..................................... False
- no_load_optim ................................... None
- no_load_rng ..................................... None
- no_save_optim ................................... None
- no_save_rng ..................................... None
- num_attention_heads ............................. 32
- num_channels .................................... 3
- num_classes ..................................... 1000
- num_layers ...................................... 32
- num_layers_per_virtual_pipeline_stage ........... None
- num_workers ..................................... 2
- onnx_safe ....................................... None
- openai_gelu ..................................... False
- optimizer ....................................... adam
- override_lr_scheduler ........................... False
- params_dtype .................................... torch.float16
- partition_activations ........................... False
- patch_dim ....................................... 16
- pipeline_model_parallel_size .................... 8
- position_embedding_type ......................... PositionEmbeddingType.absolute
- profile_backward ................................ False
- query_in_block_prob ............................. 0.1
- rampup_batch_size ............................... ['16', '16', '6_000_000']
- rank ............................................ 0
- remote_device ................................... none
- reset_attention_mask ............................ False
- reset_position_ids .............................. False
- retriever_report_topk_accuracies ................ []
- retriever_score_scaling ......................... False
- retriever_seq_length ............................ 256
- sample_rate ..................................... 1.0
- save ............................................ /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints
- save_interval ................................... 1500
- scatter_gather_tensors_in_pipeline .............. True
- scattered_embeddings ............................ False
- seed ............................................ 42
- seq_length ...................................... 2048
- sgd_momentum .................................... 0.9
- short_seq_prob .................................. 0.1
- split ........................................... 949,50,1
- split_transformers .............................. False
- synchronize_each_layer .......................... False
- tensor_model_parallel_size ...................... 4
- tensorboard_dir ................................. /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/tr8-104B-logs/tensorboard
- tensorboard_log_interval ........................ 1
- tensorboard_queue_size .......................... 5
- tile_factor ..................................... 1
- titles_data_path ................................ None
- tokenizer_name_or_path .......................... None
- tokenizer_type .................................. GPT2BPETokenizer
- train_iters ..................................... None
- train_samples ................................... 300000000
- use_checkpoint_lr_scheduler ..................... False
- use_contiguous_buffers_in_ddp ................... False
- use_cpu_initialization .......................... None
- use_one_sent_docs ............................... False
- use_pin_memory .................................. False
- virtual_pipeline_model_parallel_size ............ None
- vocab_extra_ids ................................. 0
- vocab_file ...................................... /gpfswork/rech/six/commun/code/tr8-104B/Megatron-DeepSpeed-tr8-104B/data/gpt2-vocab.json
- weight_decay .................................... 0.1
- world_size ...................................... 256
- zero_allgather_bucket_size ...................... 0.0
- zero_contigious_gradients ....................... False
- zero_reduce_bucket_size ......................... 0.0
- zero_reduce_scatter ............................. False
- zero_stage ...................................... 1
--------------------- end of arguments ---------------------
-will use batch size rampup starting from global batch size 16 to global batch size 2048 with batch size increments 16 over 6000000 samples.
-> building GPT2BPETokenizer tokenizer ...
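Two bits of arithmetic in the dump above are worth making explicit. First, world_size 256 is exactly the product of the three parallel degrees (data 8 × tensor 4 × pipeline 8). Second, `rampup_batch_size = ['16', '16', '6_000_000']` yields the schedule announced just after the arguments: start at a global batch size of 16, grow in increments of 16, and reach the full 2048 after 6,000,000 consumed samples. A minimal sketch of that schedule, illustrating the announced behavior rather than the Megatron-DeepSpeed implementation (which may step at slightly different sample boundaries):

    def global_batch_size(consumed_samples: int,
                          start: int = 16, increment: int = 16,
                          ramp_samples: int = 6_000_000, full: int = 2048) -> int:
        """Global batch size in effect after `consumed_samples` (sketch)."""
        if consumed_samples >= ramp_samples:
            return full
        steps = (full - start) // increment       # 127 increments of +16
        samples_per_step = ramp_samples // steps  # ~47,244 samples per step
        return start + increment * (consumed_samples // samples_per_step)

    # The parallel layout: world size is the product of the three degrees.
    assert 8 * 4 * 8 == 256               # data * tensor * pipeline
    assert global_batch_size(0) == 16
    assert global_batch_size(6_000_000) == 2048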
-> setting tensorboard ...
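The arguments also pin down the learning-rate schedule: linear warmup over lr_warmup_samples = 216,320 samples up to lr = 6e-05, then cosine decay to min_lr = 6e-06 over lr_decay_samples = 126,953,125. A minimal sketch of that sample-based schedule, consistent with those arguments but not the project's scheduler code (e.g. the warmup starting point is an assumption):

    import math

    def learning_rate(consumed_samples: int,
                      max_lr: float = 6e-05, min_lr: float = 6e-06,
                      warmup_samples: int = 216_320,
                      decay_samples: int = 126_953_125) -> float:
        """Cosine decay with linear warmup, driven by consumed samples."""
        if consumed_samples < warmup_samples:
            return max_lr * consumed_samples / warmup_samples  # linear warmup
        if consumed_samples > decay_samples:
            return min_lr
        progress = (consumed_samples - warmup_samples) / (decay_samples - warmup_samples)
        return min_lr + 0.5 * (max_lr - min_lr) * (1.0 + math.cos(math.pi * progress))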
Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -ninja ..................ninja [OKAY] -..................-------------------------------------------------- -[OKAY]op name - ................-------------------------------------------------- installed - ..op name compatible -................-------------------------------------------------- -installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... cpu_adam[OKAY] - ............... [YES] ...... [OKAY] -DeepSpeed general environment info: -fused_adam ............. [NO] ....... [OKAY]fused_adam - ............. [NO]fused_lamb .................... [NO] [OKAY]....... - [OKAY] -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -fused_lamb ............. [NO] ....... [OKAY] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -sparse_attn ............ [NO] ....... [OKAY] -nvcc version ..................... 11.2 -sparse_attn ............transformer [NO]............ .......[NO] .......[OKAY] -[OKAY] -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -transformer ............stochastic_transformer [NO] ........ [NO][OKAY] ....... - [OKAY] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -stochastic_transformer . [NO] ....... [OKAY] -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install pathtorch install path ............... ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version ....................torch version 1.8.1.................... - 1.8.1torch cuda version - ...............torch cuda version 11.1............... - nvcc version11.1 -..................... nvcc version11.2 ..................... 
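-These per-rank reports match what DeepSpeed's bundled `ds_report` utility prints. A minimal sketch that regenerates the gist of the environment summary from public attributes (assuming only that torch and deepspeed are importable; this is not the tool's verbatim code):
-
-    import torch
-    import deepspeed
-
-    # torch.__path__ / deepspeed.__path__ are list-like, matching the
-    # bracketed install paths shown in the log above.
-    print("torch install path ...............", list(torch.__path__))
-    print("torch version ....................", torch.__version__)
-    print("torch cuda version ...............", torch.version.cuda)
-    print("deepspeed install path ...........", list(deepspeed.__path__))
-    print("deepspeed info ...................", deepspeed.__version__)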
- > padded vocab (size: 50257) with 431 dummy tokens (new size: 50688)
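-The dummy-token count follows from Megatron's padding rule: the vocabulary is grown to the next multiple of make-vocab-size-divisible-by times the tensor-model-parallel size, so each tensor-parallel rank gets an equally sized embedding shard. A hedged sketch of that rule (the multiple of 512 is inferred from the numbers in the line above, not read from the config):
-
-    def pad_vocab(orig_size: int, multiple: int) -> int:
-        # Grow the vocab to the next multiple so the embedding
-        # splits evenly across tensor-parallel ranks.
-        padded = orig_size
-        while padded % multiple != 0:
-            padded += 1
-        return padded
-
-    # 50257 + 431 dummy tokens = 50688 = 99 * 512
-    assert pad_vocab(50257, 512) == 50688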
-> setting codecarbon ...
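-"setting codecarbon" refers to the codecarbon package, which the BigScience fork wired in to track energy use and estimated CO2 emissions. A hypothetical minimal usage, not the fork's actual integration:
-
-    from codecarbon import EmissionsTracker
-
-    tracker = EmissionsTracker()  # writes emissions.csv by default
-    tracker.start()
-    # ... training loop ...
-    emissions_kg = tracker.stop()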
-> initializing torch distributed ...
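-Behind "initializing torch distributed" is the standard process-group setup. A minimal sketch assuming the usual env:// rendezvous, where the launcher exports RANK, WORLD_SIZE, MASTER_ADDR/MASTER_PORT, and LOCAL_RANK:
-
-    import os
-    import torch
-
-    # Bind this process to its GPU before creating the NCCL group.
-    torch.cuda.set_device(int(os.environ["LOCAL_RANK"]))
-    torch.distributed.init_process_group(
-        backend="nccl",        # GPU collectives
-        init_method="env://",  # rank/world size come from the environment
-    )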
- [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`.
-async_io ............... [NO] ....... [NO]
-transformer_inference .. [NO] ....... [OKAY]
-utils .................. [YES] ...... [OKAY]
-quantizer .............. [NO] ....... 
[OKAY] --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. 
compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. 
compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_io ............... [NO] ....... async_io[NO] - ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -transformer_inference .. [NO] ....... utils[OKAY] -.................. [YES] ...... [OKAY] -utils .................. [YES]quantizer .................... [OKAY][NO] - ....... [OKAY] -quantizer .............. --------------------------------------------------[NO] - ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 
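The op-compatibility table and "DeepSpeed general environment info" block above are printed once per launched rank, which is why they appear dozens of times interleaved in this log. They can be regenerated from a single process instead. A minimal sketch, assuming DeepSpeed's bundled `ds_report` console script (installed alongside the `deepspeed` package):

    # Regenerate the op report / environment summary from one process,
    # rather than reading it interleaved once per rank as in this log.
    import subprocess

    subprocess.run(["ds_report"], check=True)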
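The "> initializing torch distributed ..." step above is the standard process-group setup that every rank performs before any model-parallel state exists. A minimal sketch, assuming the usual env:// rendezvous variables exported by the launcher (the variable handling here is illustrative, not Megatron's exact code):

    # Minimal torch.distributed bring-up, run once per rank.
    import os
    import torch

    rank = int(os.environ["RANK"])              # set by the launcher
    world_size = int(os.environ["WORLD_SIZE"])  # set by the launcher
    torch.cuda.set_device(rank % torch.cuda.device_count())
    torch.distributed.init_process_group(
        backend="nccl",        # standard backend for multi-GPU training
        init_method="env://",  # reads MASTER_ADDR / MASTER_PORT from the env
        world_size=world_size,
        rank=rank,
    )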
-> initializing tensor model parallel with size 4
-> initializing pipeline model parallel with size 8
-> setting random seeds to 42 ...
-[2021-09-24 05:52:24,592] [INFO] [checkpointing.py:226:model_parallel_cuda_manual_seed] > initializing model parallel cuda seeds on global rank 0, model parallel rank 0, and data parallel rank 0 with model parallel seed: 2760 and data parallel seed: 42
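Given the sizes just logged (tensor model parallel 4, pipeline model parallel 8), each model replica spans 4 x 8 = 32 GPUs, and the data-parallel degree is whatever remains of the world size. The world size is not printed at this point in the log, so the value below is an assumption used only for illustration:

    # Topology arithmetic for TP=4, PP=8; world_size is hypothetical here.
    tensor_mp_size = 4
    pipeline_mp_size = 8
    world_size = 256  # illustrative assumption, not taken from this log

    gpus_per_replica = tensor_mp_size * pipeline_mp_size  # 32 GPUs per replica
    assert world_size % gpus_per_replica == 0
    data_parallel_size = world_size // gpus_per_replica   # 8 replicas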
-> compiling dataset index builder ...
-make: Entering directory '/gpfsssd/worksf/projects/rech/six/commun/code/tr8-104B/Megatron-DeepSpeed-tr8-104B/megatron/data'
-make: Nothing to be done for 'default'.
-make: Leaving directory '/gpfsssd/worksf/projects/rech/six/commun/code/tr8-104B/Megatron-DeepSpeed-tr8-104B/megatron/data'
->>> done with dataset index builder. Compilation time: 0.299 seconds
-> compiling and loading fused kernels ...
-Detected CUDA files, patching ldflags
-Emitting ninja build file /gpfsssd/worksf/projects/rech/six/commun/code/tr8-104B/Megatron-DeepSpeed-tr8-104B/megatron/fused_kernels/build/build.ninja...
-Building extension module scaled_upper_triang_masked_softmax_cuda...
-Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
-ninja: no work to do.
-Loading extension module scaled_upper_triang_masked_softmax_cuda...
-Emitting ninja build file /gpfsssd/worksf/projects/rech/six/commun/code/tr8-104B/Megatron-DeepSpeed-tr8-104B/megatron/fused_kernels/build/build.ninja...
-Building extension module scaled_masked_softmax_cuda...
-Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
-ninja: no work to do.
-Loading extension module scaled_masked_softmax_cuda...
-Emitting ninja build file /gpfsssd/worksf/projects/rech/six/commun/code/tr8-104B/Megatron-DeepSpeed-tr8-104B/megatron/fused_kernels/build/build.ninja...
-Building extension module fused_mix_prec_layer_norm_cuda...
-Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
-ninja: no work to do.
-Loading extension module fused_mix_prec_layer_norm_cuda...
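The "Building extension module ..." / "Loading extension module ..." lines come from PyTorch's JIT extension loader, and "ninja: no work to do." means the previously built artifacts under megatron/fused_kernels/build were reused. A minimal sketch of that mechanism, with illustrative source paths rather than Megatron's actual file list:

    # JIT-compile and load a C++/CUDA extension the way torch does it here.
    from torch.utils.cpp_extension import load

    scaled_masked_softmax = load(
        name="scaled_masked_softmax_cuda",
        sources=[
            "scaled_masked_softmax.cpp",      # illustrative binding file
            "scaled_masked_softmax_cuda.cu",  # illustrative CUDA kernel
        ],
        extra_cuda_cflags=["-O3"],
        verbose=True,  # emits the "Emitting ninja build file ..." messages
    )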
-/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch/utils/cpp_extension.py:283: UserWarning:
-
-                               !! WARNING !!
-
-!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
-Your compiler (c++) is not compatible with the compiler Pytorch was
-built with for this platform, which is g++ on linux. Please
-use g++ to to compile your extension. Alternatively, you may
-compile PyTorch from source using c++, and then you can also use
-c++ to compile your extension.
- -See https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md for help -with compiling PyTorch from source. -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! - - !! WARNING !! - - warnings.warn(WRONG_COMPILER_WARNING.format( -/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch/utils/cpp_extension.py:283: UserWarning: - - !! WARNING !! - -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! -Your compiler (c++) is not compatible with the compiler Pytorch was -built with for this platform, which is g++ on linux. Please -use g++ to to compile your extension. Alternatively, you may -compile PyTorch from source using c++, and then you can also use -c++ to compile your extension. - -See https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md for help -with compiling PyTorch from source. -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! - - !! WARNING !! - - warnings.warn(WRONG_COMPILER_WARNING.format( -/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch/utils/cpp_extension.py:283: UserWarning: - - !! WARNING !! - -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! -Your compiler (c++) is not compatible with the compiler Pytorch was -built with for this platform, which is g++ on linux. Please -use g++ to to compile your extension. Alternatively, you may -compile PyTorch from source using c++, and then you can also use -c++ to compile your extension. - -See https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md for help -with compiling PyTorch from source. -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! - - !! WARNING !! - - warnings.warn(WRONG_COMPILER_WARNING.format( -/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch/utils/cpp_extension.py:283: UserWarning: - - !! WARNING !! - -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! -Your compiler (c++) is not compatible with the compiler Pytorch was -built with for this platform, which is g++ on linux. Please -use g++ to to compile your extension. Alternatively, you may -compile PyTorch from source using c++, and then you can also use -c++ to compile your extension. - -See https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md for help -with compiling PyTorch from source. -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! - - !! WARNING !! - - warnings.warn(WRONG_COMPILER_WARNING.format( -/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch/utils/cpp_extension.py:283: UserWarning: - - !! WARNING !! - -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! -Your compiler (c++) is not compatible with the compiler Pytorch was -built with for this platform, which is g++ on linux. Please -use g++ to to compile your extension. Alternatively, you may -compile PyTorch from source using c++, and then you can also use -c++ to compile your extension. - -See https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md for help -with compiling PyTorch from source. -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! - - !! WARNING !! - - warnings.warn(WRONG_COMPILER_WARNING.format( -/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch/utils/cpp_extension.py:283: UserWarning: - - !! WARNING !! 
- -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! -Your compiler (c++) is not compatible with the compiler Pytorch was -built with for this platform, which is g++ on linux. Please -use g++ to to compile your extension. Alternatively, you may -compile PyTorch from source using c++, and then you can also use -c++ to compile your extension. - -See https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md for help -with compiling PyTorch from source. -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! - - !! WARNING !! - - warnings.warn(WRONG_COMPILER_WARNING.format( -/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch/utils/cpp_extension.py:283: UserWarning: - - !! WARNING !! - -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! -Your compiler (c++) is not compatible with the compiler Pytorch was -built with for this platform, which is g++ on linux. Please -use g++ to to compile your extension. Alternatively, you may -compile PyTorch from source using c++, and then you can also use -c++ to compile your extension. - -See https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md for help -with compiling PyTorch from source. -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! - - !! WARNING !! - - warnings.warn(WRONG_COMPILER_WARNING.format( ->>> done with compiling and loading fused kernels. Compilation time: 19.795 seconds -time to initialize megatron (seconds): 12.702 -[after megatron is initialized] datetime: 2021-09-24 05:52:44 -building GPT model ... -[2021-09-24 05:52:44,769] [INFO] [utils.py:680:see_memory_usage] Before Building Model -/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch/cuda/memory.py:373: FutureWarning: torch.cuda.memory_cached has been renamed to torch.cuda.memory_reserved - warnings.warn( -/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch/cuda/memory.py:381: FutureWarning: torch.cuda.max_memory_cached has been renamed to torch.cuda.max_memory_reserved - warnings.warn( -[2021-09-24 05:52:44,770] [INFO] [utils.py:681:see_memory_usage] MA 0.0 GB Max_MA 0.0 GB CA 0.0 GB Max_CA 0 GB -[2021-09-24 05:52:44,770] [INFO] [utils.py:689:see_memory_usage] CPU Virtual Memory: used = 37.83 GB, percent = 20.2% -SEED_LAYERS=False BASE_SEED=1234 SEED_FN=None -Using topology: {ProcessCoord(pipe=0, data=0, model=0): 0, ProcessCoord(pipe=0, data=0, model=1): 1, ProcessCoord(pipe=0, data=0, model=2): 2, ProcessCoord(pipe=0, data=0, model=3): 3, ProcessCoord(pipe=0, data=1, model=0): 4, ProcessCoord(pipe=0, data=1, model=1): 5, ProcessCoord(pipe=0, data=1, model=2): 6, ProcessCoord(pipe=0, data=1, model=3): 7, ProcessCoord(pipe=0, data=2, model=0): 8, ProcessCoord(pipe=0, data=2, model=1): 9, ProcessCoord(pipe=0, data=2, model=2): 10, ProcessCoord(pipe=0, data=2, model=3): 11, ProcessCoord(pipe=0, data=3, model=0): 12, ProcessCoord(pipe=0, data=3, model=1): 13, ProcessCoord(pipe=0, data=3, model=2): 14, ProcessCoord(pipe=0, data=3, model=3): 15, ProcessCoord(pipe=0, data=4, model=0): 16, ProcessCoord(pipe=0, data=4, model=1): 17, ProcessCoord(pipe=0, data=4, model=2): 18, ProcessCoord(pipe=0, data=4, model=3): 19, ProcessCoord(pipe=0, data=5, model=0): 20, ProcessCoord(pipe=0, data=5, model=1): 21, ProcessCoord(pipe=0, data=5, model=2): 22, ProcessCoord(pipe=0, data=5, model=3): 23, ProcessCoord(pipe=0, data=6, model=0): 24, ProcessCoord(pipe=0, data=6, model=1): 25, ProcessCoord(pipe=0, data=6, 
model=2): 26, ProcessCoord(pipe=0, data=6, model=3): 27, ProcessCoord(pipe=0, data=7, model=0): 28, ProcessCoord(pipe=0, data=7, model=1): 29, ProcessCoord(pipe=0, data=7, model=2): 30, ProcessCoord(pipe=0, data=7, model=3): 31, ProcessCoord(pipe=1, data=0, model=0): 32, ProcessCoord(pipe=1, data=0, model=1): 33, ProcessCoord(pipe=1, data=0, model=2): 34, ProcessCoord(pipe=1, data=0, model=3): 35, ProcessCoord(pipe=1, data=1, model=0): 36, ProcessCoord(pipe=1, data=1, model=1): 37, ProcessCoord(pipe=1, data=1, model=2): 38, ProcessCoord(pipe=1, data=1, model=3): 39, ProcessCoord(pipe=1, data=2, model=0): 40, ProcessCoord(pipe=1, data=2, model=1): 41, ProcessCoord(pipe=1, data=2, model=2): 42, ProcessCoord(pipe=1, data=2, model=3): 43, ProcessCoord(pipe=1, data=3, model=0): 44, ProcessCoord(pipe=1, data=3, model=1): 45, ProcessCoord(pipe=1, data=3, model=2): 46, ProcessCoord(pipe=1, data=3, model=3): 47, ProcessCoord(pipe=1, data=4, model=0): 48, ProcessCoord(pipe=1, data=4, model=1): 49, ProcessCoord(pipe=1, data=4, model=2): 50, ProcessCoord(pipe=1, data=4, model=3): 51, ProcessCoord(pipe=1, data=5, model=0): 52, ProcessCoord(pipe=1, data=5, model=1): 53, ProcessCoord(pipe=1, data=5, model=2): 54, ProcessCoord(pipe=1, data=5, model=3): 55, ProcessCoord(pipe=1, data=6, model=0): 56, ProcessCoord(pipe=1, data=6, model=1): 57, ProcessCoord(pipe=1, data=6, model=2): 58, ProcessCoord(pipe=1, data=6, model=3): 59, ProcessCoord(pipe=1, data=7, model=0): 60, ProcessCoord(pipe=1, data=7, model=1): 61, ProcessCoord(pipe=1, data=7, model=2): 62, ProcessCoord(pipe=1, data=7, model=3): 63, ProcessCoord(pipe=2, data=0, model=0): 64, ProcessCoord(pipe=2, data=0, model=1): 65, ProcessCoord(pipe=2, data=0, model=2): 66, ProcessCoord(pipe=2, data=0, model=3): 67, ProcessCoord(pipe=2, data=1, model=0): 68, ProcessCoord(pipe=2, data=1, model=1): 69, ProcessCoord(pipe=2, data=1, model=2): 70, ProcessCoord(pipe=2, data=1, model=3): 71, ProcessCoord(pipe=2, data=2, model=0): 72, ProcessCoord(pipe=2, data=2, model=1): 73, ProcessCoord(pipe=2, data=2, model=2): 74, ProcessCoord(pipe=2, data=2, model=3): 75, ProcessCoord(pipe=2, data=3, model=0): 76, ProcessCoord(pipe=2, data=3, model=1): 77, ProcessCoord(pipe=2, data=3, model=2): 78, ProcessCoord(pipe=2, data=3, model=3): 79, ProcessCoord(pipe=2, data=4, model=0): 80, ProcessCoord(pipe=2, data=4, model=1): 81, ProcessCoord(pipe=2, data=4, model=2): 82, ProcessCoord(pipe=2, data=4, model=3): 83, ProcessCoord(pipe=2, data=5, model=0): 84, ProcessCoord(pipe=2, data=5, model=1): 85, ProcessCoord(pipe=2, data=5, model=2): 86, ProcessCoord(pipe=2, data=5, model=3): 87, ProcessCoord(pipe=2, data=6, model=0): 88, ProcessCoord(pipe=2, data=6, model=1): 89, ProcessCoord(pipe=2, data=6, model=2): 90, ProcessCoord(pipe=2, data=6, model=3): 91, ProcessCoord(pipe=2, data=7, model=0): 92, ProcessCoord(pipe=2, data=7, model=1): 93, ProcessCoord(pipe=2, data=7, model=2): 94, ProcessCoord(pipe=2, data=7, model=3): 95, ProcessCoord(pipe=3, data=0, model=0): 96, ProcessCoord(pipe=3, data=0, model=1): 97, ProcessCoord(pipe=3, data=0, model=2): 98, ProcessCoord(pipe=3, data=0, model=3): 99, ProcessCoord(pipe=3, data=1, model=0): 100, ProcessCoord(pipe=3, data=1, model=1): 101, ProcessCoord(pipe=3, data=1, model=2): 102, ProcessCoord(pipe=3, data=1, model=3): 103, ProcessCoord(pipe=3, data=2, model=0): 104, ProcessCoord(pipe=3, data=2, model=1): 105, ProcessCoord(pipe=3, data=2, model=2): 106, ProcessCoord(pipe=3, data=2, model=3): 107, ProcessCoord(pipe=3, data=3, model=0): 108, 
ProcessCoord(pipe=3, data=3, model=1): 109, ProcessCoord(pipe=3, data=3, model=2): 110, ProcessCoord(pipe=3, data=3, model=3): 111, ProcessCoord(pipe=3, data=4, model=0): 112, ProcessCoord(pipe=3, data=4, model=1): 113, ProcessCoord(pipe=3, data=4, model=2): 114, ProcessCoord(pipe=3, data=4, model=3): 115, ProcessCoord(pipe=3, data=5, model=0): 116, ProcessCoord(pipe=3, data=5, model=1): 117, ProcessCoord(pipe=3, data=5, model=2): 118, ProcessCoord(pipe=3, data=5, model=3): 119, ProcessCoord(pipe=3, data=6, model=0): 120, ProcessCoord(pipe=3, data=6, model=1): 121, ProcessCoord(pipe=3, data=6, model=2): 122, ProcessCoord(pipe=3, data=6, model=3): 123, ProcessCoord(pipe=3, data=7, model=0): 124, ProcessCoord(pipe=3, data=7, model=1): 125, ProcessCoord(pipe=3, data=7, model=2): 126, ProcessCoord(pipe=3, data=7, model=3): 127, ProcessCoord(pipe=4, data=0, model=0): 128, ProcessCoord(pipe=4, data=0, model=1): 129, ProcessCoord(pipe=4, data=0, model=2): 130, ProcessCoord(pipe=4, data=0, model=3): 131, ProcessCoord(pipe=4, data=1, model=0): 132, ProcessCoord(pipe=4, data=1, model=1): 133, ProcessCoord(pipe=4, data=1, model=2): 134, ProcessCoord(pipe=4, data=1, model=3): 135, ProcessCoord(pipe=4, data=2, model=0): 136, ProcessCoord(pipe=4, data=2, model=1): 137, ProcessCoord(pipe=4, data=2, model=2): 138, ProcessCoord(pipe=4, data=2, model=3): 139, ProcessCoord(pipe=4, data=3, model=0): 140, ProcessCoord(pipe=4, data=3, model=1): 141, ProcessCoord(pipe=4, data=3, model=2): 142, ProcessCoord(pipe=4, data=3, model=3): 143, ProcessCoord(pipe=4, data=4, model=0): 144, ProcessCoord(pipe=4, data=4, model=1): 145, ProcessCoord(pipe=4, data=4, model=2): 146, ProcessCoord(pipe=4, data=4, model=3): 147, ProcessCoord(pipe=4, data=5, model=0): 148, ProcessCoord(pipe=4, data=5, model=1): 149, ProcessCoord(pipe=4, data=5, model=2): 150, ProcessCoord(pipe=4, data=5, model=3): 151, ProcessCoord(pipe=4, data=6, model=0): 152, ProcessCoord(pipe=4, data=6, model=1): 153, ProcessCoord(pipe=4, data=6, model=2): 154, ProcessCoord(pipe=4, data=6, model=3): 155, ProcessCoord(pipe=4, data=7, model=0): 156, ProcessCoord(pipe=4, data=7, model=1): 157, ProcessCoord(pipe=4, data=7, model=2): 158, ProcessCoord(pipe=4, data=7, model=3): 159, ProcessCoord(pipe=5, data=0, model=0): 160, ProcessCoord(pipe=5, data=0, model=1): 161, ProcessCoord(pipe=5, data=0, model=2): 162, ProcessCoord(pipe=5, data=0, model=3): 163, ProcessCoord(pipe=5, data=1, model=0): 164, ProcessCoord(pipe=5, data=1, model=1): 165, ProcessCoord(pipe=5, data=1, model=2): 166, ProcessCoord(pipe=5, data=1, model=3): 167, ProcessCoord(pipe=5, data=2, model=0): 168, ProcessCoord(pipe=5, data=2, model=1): 169, ProcessCoord(pipe=5, data=2, model=2): 170, ProcessCoord(pipe=5, data=2, model=3): 171, ProcessCoord(pipe=5, data=3, model=0): 172, ProcessCoord(pipe=5, data=3, model=1): 173, ProcessCoord(pipe=5, data=3, model=2): 174, ProcessCoord(pipe=5, data=3, model=3): 175, ProcessCoord(pipe=5, data=4, model=0): 176, ProcessCoord(pipe=5, data=4, model=1): 177, ProcessCoord(pipe=5, data=4, model=2): 178, ProcessCoord(pipe=5, data=4, model=3): 179, ProcessCoord(pipe=5, data=5, model=0): 180, ProcessCoord(pipe=5, data=5, model=1): 181, ProcessCoord(pipe=5, data=5, model=2): 182, ProcessCoord(pipe=5, data=5, model=3): 183, ProcessCoord(pipe=5, data=6, model=0): 184, ProcessCoord(pipe=5, data=6, model=1): 185, ProcessCoord(pipe=5, data=6, model=2): 186, ProcessCoord(pipe=5, data=6, model=3): 187, ProcessCoord(pipe=5, data=7, model=0): 188, ProcessCoord(pipe=5, data=7, 
model=1): 189, ProcessCoord(pipe=5, data=7, model=2): 190, ProcessCoord(pipe=5, data=7, model=3): 191, ProcessCoord(pipe=6, data=0, model=0): 192, ProcessCoord(pipe=6, data=0, model=1): 193, ProcessCoord(pipe=6, data=0, model=2): 194, ProcessCoord(pipe=6, data=0, model=3): 195, ProcessCoord(pipe=6, data=1, model=0): 196, ProcessCoord(pipe=6, data=1, model=1): 197, ProcessCoord(pipe=6, data=1, model=2): 198, ProcessCoord(pipe=6, data=1, model=3): 199, ProcessCoord(pipe=6, data=2, model=0): 200, ProcessCoord(pipe=6, data=2, model=1): 201, ProcessCoord(pipe=6, data=2, model=2): 202, ProcessCoord(pipe=6, data=2, model=3): 203, ProcessCoord(pipe=6, data=3, model=0): 204, ProcessCoord(pipe=6, data=3, model=1): 205, ProcessCoord(pipe=6, data=3, model=2): 206, ProcessCoord(pipe=6, data=3, model=3): 207, ProcessCoord(pipe=6, data=4, model=0): 208, ProcessCoord(pipe=6, data=4, model=1): 209, ProcessCoord(pipe=6, data=4, model=2): 210, ProcessCoord(pipe=6, data=4, model=3): 211, ProcessCoord(pipe=6, data=5, model=0): 212, ProcessCoord(pipe=6, data=5, model=1): 213, ProcessCoord(pipe=6, data=5, model=2): 214, ProcessCoord(pipe=6, data=5, model=3): 215, ProcessCoord(pipe=6, data=6, model=0): 216, ProcessCoord(pipe=6, data=6, model=1): 217, ProcessCoord(pipe=6, data=6, model=2): 218, ProcessCoord(pipe=6, data=6, model=3): 219, ProcessCoord(pipe=6, data=7, model=0): 220, ProcessCoord(pipe=6, data=7, model=1): 221, ProcessCoord(pipe=6, data=7, model=2): 222, ProcessCoord(pipe=6, data=7, model=3): 223, ProcessCoord(pipe=7, data=0, model=0): 224, ProcessCoord(pipe=7, data=0, model=1): 225, ProcessCoord(pipe=7, data=0, model=2): 226, ProcessCoord(pipe=7, data=0, model=3): 227, ProcessCoord(pipe=7, data=1, model=0): 228, ProcessCoord(pipe=7, data=1, model=1): 229, ProcessCoord(pipe=7, data=1, model=2): 230, ProcessCoord(pipe=7, data=1, model=3): 231, ProcessCoord(pipe=7, data=2, model=0): 232, ProcessCoord(pipe=7, data=2, model=1): 233, ProcessCoord(pipe=7, data=2, model=2): 234, ProcessCoord(pipe=7, data=2, model=3): 235, ProcessCoord(pipe=7, data=3, model=0): 236, ProcessCoord(pipe=7, data=3, model=1): 237, ProcessCoord(pipe=7, data=3, model=2): 238, ProcessCoord(pipe=7, data=3, model=3): 239, ProcessCoord(pipe=7, data=4, model=0): 240, ProcessCoord(pipe=7, data=4, model=1): 241, ProcessCoord(pipe=7, data=4, model=2): 242, ProcessCoord(pipe=7, data=4, model=3): 243, ProcessCoord(pipe=7, data=5, model=0): 244, ProcessCoord(pipe=7, data=5, model=1): 245, ProcessCoord(pipe=7, data=5, model=2): 246, ProcessCoord(pipe=7, data=5, model=3): 247, ProcessCoord(pipe=7, data=6, model=0): 248, ProcessCoord(pipe=7, data=6, model=1): 249, ProcessCoord(pipe=7, data=6, model=2): 250, ProcessCoord(pipe=7, data=6, model=3): 251, ProcessCoord(pipe=7, data=7, model=0): 252, ProcessCoord(pipe=7, data=7, model=1): 253, ProcessCoord(pipe=7, data=7, model=2): 254, ProcessCoord(pipe=7, data=7, model=3): 255}
-[2021-09-24 05:52:46,176] [INFO] [module.py:360:_partition_layers] Partitioning pipeline stages with method type:transformer
-stage=0 layers=7
- 0: _to_float16
- 1: EmbeddingPipe
- 2: 
- 3: ParallelTransformerLayerPipe
- 4: ParallelTransformerLayerPipe
- 5: ParallelTransformerLayerPipe
- 6: ParallelTransformerLayerPipe
-stage=1 layers=4
- 7: ParallelTransformerLayerPipe
- 8: ParallelTransformerLayerPipe
- 9: ParallelTransformerLayerPipe
- 10: ParallelTransformerLayerPipe
-stage=2 layers=4
- 11: ParallelTransformerLayerPipe
- 12: ParallelTransformerLayerPipe
- 13: ParallelTransformerLayerPipe
- 14: ParallelTransformerLayerPipe
-stage=3 layers=4
- 15: ParallelTransformerLayerPipe
- 16: ParallelTransformerLayerPipe
- 17: ParallelTransformerLayerPipe
- 18: ParallelTransformerLayerPipe
-stage=4 layers=4
- 19: ParallelTransformerLayerPipe
- 20: ParallelTransformerLayerPipe
- 21: ParallelTransformerLayerPipe
- 22: ParallelTransformerLayerPipe
-stage=5 layers=4
- 23: ParallelTransformerLayerPipe
- 24: ParallelTransformerLayerPipe
- 25: ParallelTransformerLayerPipe
- 26: ParallelTransformerLayerPipe
-stage=6 layers=4
- 27: ParallelTransformerLayerPipe
- 28: ParallelTransformerLayerPipe
- 29: ParallelTransformerLayerPipe
- 30: ParallelTransformerLayerPipe
-stage=7 layers=8
- 31: ParallelTransformerLayerPipe
- 32: ParallelTransformerLayerPipe
- 33: ParallelTransformerLayerPipe
- 34: ParallelTransformerLayerPipe
- 35: 
- 36: MixedFusedLayerNorm
- 37: EmbeddingPipe
- 38: float16_to_fp32
- loss: CrossEntropy
- > number of parameters on (tensor, pipeline) model parallel rank (2, 1): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (2, 3): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (3, 1): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (3, 5): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (1, 5): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (0, 5): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (2, 4): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (0, 4): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (1, 6): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (0, 6): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (2, 6): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (3, 6): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (1, 3): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (0, 3): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (3, 3): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (2, 2): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (3, 2): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (1, 2): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (2, 5): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (0, 1): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (1, 1): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (3, 4): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (0, 2): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (1, 4): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (0, 7): 1986498560
- > number of parameters on (tensor, pipeline) model parallel rank (2, 0): 1986465792
- > number of parameters on (tensor, pipeline) model parallel rank (1, 7): 1986498560
- > number of parameters on (tensor, pipeline) model parallel rank (1, 0): 1986465792
- > number of parameters on (tensor, pipeline) model parallel rank (3, 7): 1986498560
- > number of parameters on (tensor, pipeline) model parallel rank (3, 0): 1986465792
- > number of parameters on (tensor, pipeline) model parallel rank (2, 7): 1986498560
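The topology map and stage listing above follow two simple rules: the 256 ranks form an 8 x 8 x 4 (pipe x data x model) grid with the model axis varying fastest, so rank = (pipe*8 + data)*4 + model, and the pipeline layers are cut at fixed boundaries, four transformer blocks per middle stage. A minimal Python sketch reproducing both, assuming only the values printed above (the helper names and constants are ours, not Megatron-DeepSpeed API):

    # Reconstruct the ProcessCoord -> rank map printed above: model axis is
    # fastest-moving, pipe slowest, on an 8 x 8 x 4 grid of 256 processes.
    PIPE, DATA, MODEL = 8, 8, 4

    def coord_to_rank(pipe: int, data: int, model: int) -> int:
        return (pipe * DATA + data) * MODEL + model

    assert coord_to_rank(0, 0, 0) == 0    # ProcessCoord(pipe=0, data=0, model=0): 0
    assert coord_to_rank(0, 6, 2) == 26   # ProcessCoord(pipe=0, data=6, model=2): 26
    assert coord_to_rank(7, 7, 3) == 255  # last entry in the map

    # Reproduce the stage partition: boundaries read off the "stage=N layers=M"
    # listing; stage 0 also carries the cast/embedding layers and stage 7 the
    # final norm, tied embedding and fp32 cast, hence 7 and 8 layers at the ends.
    bounds = [0, 7, 11, 15, 19, 23, 27, 31, 39]
    for stage in range(PIPE):
        lo, hi = bounds[stage], bounds[stage + 1]
        print(f"stage={stage} layers={hi - lo} [{lo}, {hi})")

The printed "[lo, hi)" ranges match the per-rank LAYERS=... [0, 7) ... [31, 39) lines that the DeepSpeed engine logs further down.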
-[2021-09-24 05:52:47,386] [INFO] [utils.py:680:see_memory_usage] After Building Model
-[2021-09-24 05:52:47,387] [INFO] [utils.py:681:see_memory_usage] MA 3.77 GB Max_MA 3.79 GB CA 3.79 GB Max_CA 4 GB
-[2021-09-24 05:52:47,388] [INFO] [utils.py:689:see_memory_usage] CPU Virtual Memory: used = 38.02 GB, percent = 20.3%
- > number of parameters on (tensor, pipeline) model parallel rank (0, 0): 1986465792
-setting training iterations to 159576
-> learning rate decay style: cosine
-DeepSpeed is enabled.
-[2021-09-24 05:52:47,464] [INFO] [logging.py:68:log_dist] [Rank 0] DeepSpeed info: version=0.4.2+bc17042, git-hash=bc17042, git-branch=big-science
-[2021-09-24 05:52:47,544] [INFO] [engine.py:179:__init__] DeepSpeed Flops Profiler Enabled: False
-[2021-09-24 05:52:47,544] [INFO] [engine.py:736:_configure_optimizer] Removing param_group that has no 'params' in the client Optimizer
-[2021-09-24 05:52:47,544] [INFO] [engine.py:741:_configure_optimizer] Using client Optimizer as basic optimizer
-[2021-09-24 05:52:47,545] [INFO] [engine.py:750:_configure_optimizer] DeepSpeed Basic Optimizer = FusedAdam
-[2021-09-24 05:52:47,545] [INFO] [utils.py:43:is_zero_supported_optimizer] Checking ZeRO support for optimizer=FusedAdam type=
-[2021-09-24 05:52:47,545] [INFO] [logging.py:68:log_dist] [Rank 0] Creating fp16 ZeRO stage 1 optimizer
-[2021-09-24 05:52:47,545] [INFO] [stage2.py:106:__init__] Reduce bucket size 500000000
-[2021-09-24 05:52:47,545] [INFO] [stage2.py:107:__init__] Allgather bucket size 500000000
-[2021-09-24 05:52:47,545] [INFO] [stage2.py:108:__init__] CPU Offload: False
-[2021-09-24 05:52:47,545] [INFO] [stage2.py:109:__init__] Round robin gradient partitioning: False
-[2021-09-24 05:52:52,071] [INFO] [stage2.py:419:__init__] optimizer state initialized
-[2021-09-24 05:52:52,071] [INFO] [logging.py:68:log_dist] [Rank 0] DeepSpeed Final Optimizer = FusedAdam
-[2021-09-24 05:52:52,071] [INFO] [engine.py:553:_configure_lr_scheduler] DeepSpeed using client LR scheduler
-[2021-09-24 05:52:52,071] [INFO] [logging.py:68:log_dist] [Rank 0] DeepSpeed LR Scheduler = 
-[2021-09-24 05:52:52,072] [INFO] [logging.py:68:log_dist] [Rank 0] step=0, skipped=0, lr=[0.0, 0.0], mom=[(0.9, 0.999), (0.9, 0.999)]
-[2021-09-24 05:52:52,072] [INFO] [config.py:900:print] DeepSpeedEngine configuration:
-[2021-09-24 05:52:52,072] [INFO] [config.py:904:print] activation_checkpointing_config {
- "partition_activations": false,
- "contiguous_memory_optimization": false,
- "cpu_checkpointing": false,
- "number_checkpoints": null,
- "synchronize_checkpoint_boundary": false,
- "profile": false
-}
-[2021-09-24 05:52:52,072] [INFO] [config.py:904:print] aio_config ................... {'block_size': 1048576, 'queue_depth': 8, 'thread_count': 1, 'single_submit': False, 'overlap_events': True}
-[2021-09-24 05:52:52,072] [INFO] [config.py:904:print] allreduce_always_fp32 ........ False
-[2021-09-24 05:52:52,072] [INFO] [config.py:904:print] amp_enabled .................. False
-[2021-09-24 05:52:52,072] [INFO] [config.py:904:print] amp_params ................... False
-[2021-09-24 05:52:52,072] [INFO] [config.py:904:print] checkpoint_tag_validation_enabled True
-[2021-09-24 05:52:52,072] [INFO] [config.py:904:print] checkpoint_tag_validation_fail False
-[2021-09-24 05:52:52,072] [INFO] [config.py:904:print] disable_allgather ............ False
-[2021-09-24 05:52:52,072] [INFO] [config.py:904:print] dump_state ................... False
-[2021-09-24 05:52:52,072] [INFO] [config.py:904:print] dynamic_loss_scale_args ...... {'init_scale': 4096, 'scale_window': 500, 'delayed_shift': 2, 'min_scale': 1}
-[2021-09-24 05:52:52,072] [INFO] [config.py:904:print] eigenvalue_enabled ........... False
-[2021-09-24 05:52:52,072] [INFO] [config.py:904:print] eigenvalue_gas_boundary_resolution 1
-[2021-09-24 05:52:52,072] [INFO] [config.py:904:print] eigenvalue_layer_name ........ bert.encoder.layer
-[2021-09-24 05:52:52,072] [INFO] [config.py:904:print] eigenvalue_layer_num ......... 0
-[2021-09-24 05:52:52,072] [INFO] [config.py:904:print] eigenvalue_max_iter .......... 100
-[2021-09-24 05:52:52,072] [INFO] [config.py:904:print] eigenvalue_stability ......... 1e-06
-[2021-09-24 05:52:52,072] [INFO] [config.py:904:print] eigenvalue_tol ............... 0.01
-[2021-09-24 05:52:52,072] [INFO] [config.py:904:print] eigenvalue_verbose ........... False
-[2021-09-24 05:52:52,073] [INFO] [config.py:904:print] elasticity_enabled ........... False
-[2021-09-24 05:52:52,073] [INFO] [config.py:904:print] flops_profiler_config ........ {
- "enabled": false,
- "profile_step": 1,
- "module_depth": -1,
- "top_modules": 1,
- "detailed": true,
- "output_file": null
-}
-[2021-09-24 05:52:52,073] [INFO] [config.py:904:print] fp16_enabled ................. True
-[2021-09-24 05:52:52,073] [INFO] [config.py:904:print] fp16_mixed_quantize .......... False
-[2021-09-24 05:52:52,073] [INFO] [config.py:904:print] global_rank .................. 0
-[2021-09-24 05:52:52,073] [INFO] [config.py:904:print] gradient_accumulation_steps .. 256
-[2021-09-24 05:52:52,073] [INFO] [config.py:904:print] gradient_clipping ............ 1.0
-[2021-09-24 05:52:52,073] [INFO] [config.py:904:print] gradient_predivide_factor .... 1.0
-[2021-09-24 05:52:52,073] [INFO] [config.py:904:print] initial_dynamic_scale ........ 4096
-[2021-09-24 05:52:52,073] [INFO] [config.py:904:print] loss_scale ................... 0
-[2021-09-24 05:52:52,073] [INFO] [config.py:904:print] memory_breakdown ............. False
-[2021-09-24 05:52:52,073] [INFO] [config.py:904:print] optimizer_legacy_fusion ...... False
-[2021-09-24 05:52:52,073] [INFO] [config.py:904:print] optimizer_name ............... None
-[2021-09-24 05:52:52,073] [INFO] [config.py:904:print] optimizer_params ............. None
-[2021-09-24 05:52:52,073] [INFO] [config.py:904:print] pipeline ..................... {'stages': 'auto', 'partition': 'best', 'seed_layers': False, 'activation_checkpoint_interval': 0}
-[2021-09-24 05:52:52,073] [INFO] [config.py:904:print] pld_enabled .................. False
-[2021-09-24 05:52:52,073] [INFO] [config.py:904:print] pld_params ................... False
-[2021-09-24 05:52:52,073] [INFO] [config.py:904:print] prescale_gradients ........... False
-[2021-09-24 05:52:52,073] [INFO] [config.py:904:print] quantize_change_rate ......... 0.001
-[2021-09-24 05:52:52,073] [INFO] [config.py:904:print] quantize_groups .............. 1
-[2021-09-24 05:52:52,073] [INFO] [config.py:904:print] quantize_offset .............. 1000
-[2021-09-24 05:52:52,073] [INFO] [config.py:904:print] quantize_period .............. 1000
-[2021-09-24 05:52:52,073] [INFO] [config.py:904:print] quantize_rounding ............ 0
-[2021-09-24 05:52:52,073] [INFO] [config.py:904:print] quantize_start_bits .......... 16
-[2021-09-24 05:52:52,073] [INFO] [config.py:904:print] quantize_target_bits ......... 8
-[2021-09-24 05:52:52,073] [INFO] [config.py:904:print] quantize_training_enabled .... False
-[2021-09-24 05:52:52,073] [INFO] [config.py:904:print] quantize_type ................ 0
-[2021-09-24 05:52:52,073] [INFO] [config.py:904:print] quantize_verbose ............. False
-[2021-09-24 05:52:52,073] [INFO] [config.py:904:print] scheduler_name ............... None
-[2021-09-24 05:52:52,073] [INFO] [config.py:904:print] scheduler_params ............. None
-[2021-09-24 05:52:52,073] [INFO] [config.py:904:print] sparse_attention ............. None
-[2021-09-24 05:52:52,073] [INFO] [config.py:904:print] sparse_gradients_enabled ..... False
-[2021-09-24 05:52:52,073] [INFO] [config.py:904:print] steps_per_print .............. 2000
-[2021-09-24 05:52:52,073] [INFO] [config.py:904:print] tensorboard_enabled .......... False
-[2021-09-24 05:52:52,073] [INFO] [config.py:904:print] tensorboard_job_name ......... DeepSpeedJobName
-[2021-09-24 05:52:52,074] [INFO] [config.py:904:print] tensorboard_output_path ...... 
-[2021-09-24 05:52:52,074] [INFO] [config.py:904:print] train_batch_size ............. 2048
-[2021-09-24 05:52:52,074] [INFO] [config.py:904:print] train_micro_batch_size_per_gpu 1
-[2021-09-24 05:52:52,074] [INFO] [config.py:904:print] use_quantizer_kernel ......... False
-[2021-09-24 05:52:52,074] [INFO] [config.py:904:print] wall_clock_breakdown ......... False
-[2021-09-24 05:52:52,074] [INFO] [config.py:904:print] world_size ................... 8
-[2021-09-24 05:52:52,074] [INFO] [config.py:904:print] zero_allow_untested_optimizer False
-[2021-09-24 05:52:52,074] [INFO] [config.py:904:print] zero_config .................. {
- "stage": 1,
- "contiguous_gradients": false,
- "reduce_scatter": true,
- "reduce_bucket_size": 5.000000e+08,
- "allgather_partitions": true,
- "allgather_bucket_size": 5.000000e+08,
- "overlap_comm": false,
- "load_from_fp32_weights": true,
- "elastic_checkpoint": true,
- "offload_param": null,
- "offload_optimizer": null,
- "sub_group_size": 1.000000e+09,
- "prefetch_bucket_size": 5.000000e+07,
- "param_persistence_threshold": 1.000000e+05,
- "max_live_parameters": 1.000000e+09,
- "max_reuse_distance": 1.000000e+09,
- "gather_fp16_weights_on_model_save": false,
- "ignore_unused_parameters": true,
- "round_robin_gradients": false,
- "legacy_stage1": false
-}
-[2021-09-24 05:52:52,074] [INFO] [config.py:904:print] zero_enabled ................. True
-[2021-09-24 05:52:52,074] [INFO] [config.py:904:print] zero_optimization_stage ...... 1
-[2021-09-24 05:52:52,074] [INFO] [config.py:906:print] json = {
- "train_micro_batch_size_per_gpu": 1,
- "train_batch_size": 2.048000e+03,
- "gradient_clipping": 1.0,
- "zero_optimization": {
- "stage": 1
- },
- "fp16": {
- "enabled": true,
- "loss_scale": 0,
- "loss_scale_window": 500,
- "hysteresis": 2,
- "min_loss_scale": 1,
- "initial_scale_power": 12
- },
- "steps_per_print": 2.000000e+03,
- "wall_clock_breakdown": false
-}
-[2021-09-24 05:52:52,074] [INFO] [engine.py:76:__init__] CONFIG: micro_batches=256 micro_batch_size=1
-[2021-09-24 05:52:52,378] [INFO] [engine.py:134:__init__] RANK=0 STAGE=0 LAYERS=7 [0, 7) STAGE_PARAMS=1986465792 (1986.466M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-24 05:52:52,378] [INFO] [engine.py:134:__init__] RANK=3 STAGE=0 LAYERS=7 [0, 7) STAGE_PARAMS=1986465792 (1986.466M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-24 05:52:52,378] [INFO] [engine.py:134:__init__] RANK=1 STAGE=0 LAYERS=7 [0, 7) STAGE_PARAMS=1986465792 (1986.466M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-24 05:52:52,378] [INFO] [engine.py:134:__init__] RANK=2 STAGE=0 LAYERS=7 [0, 7) STAGE_PARAMS=1986465792 (1986.466M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-24 05:52:52,378] [INFO] [engine.py:134:__init__] RANK=64 STAGE=2 LAYERS=4 [11, 15) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-24 05:52:52,378] [INFO] [engine.py:134:__init__] RANK=66 STAGE=2 LAYERS=4 [11, 15) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-24 05:52:52,378] [INFO] [engine.py:134:__init__] RANK=65 STAGE=2 LAYERS=4 [11, 15) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-24 05:52:52,378] [INFO] [engine.py:134:__init__] RANK=67 STAGE=2 LAYERS=4 [11, 15) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-24 05:52:52,378] [INFO] [engine.py:134:__init__] RANK=195 STAGE=6 LAYERS=4 [27, 31) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-24 05:52:52,378] [INFO] [engine.py:134:__init__] RANK=193 STAGE=6 LAYERS=4 [27, 31) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-24 05:52:52,378] [INFO] [engine.py:134:__init__] RANK=192 STAGE=6 LAYERS=4 [27, 31) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-24 05:52:52,378] [INFO] [engine.py:134:__init__] RANK=194 STAGE=6 LAYERS=4 [27, 31) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-24 05:52:52,378] [INFO] [engine.py:134:__init__] RANK=130 STAGE=4 LAYERS=4 [19, 23) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-24 05:52:52,378] [INFO] [engine.py:134:__init__] RANK=129 STAGE=4 LAYERS=4 [19, 23) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-24 05:52:52,378] [INFO] [engine.py:134:__init__] RANK=128 STAGE=4 LAYERS=4 [19, 23) STAGE_PARAMS=1745293312 (1745.293M) 
TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 05:52:52,378] [INFO] [engine.py:134:__init__] RANK=131 STAGE=4 LAYERS=4 [19, 23) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 05:52:52,378] [INFO] [engine.py:134:__init__] RANK=97 STAGE=3 LAYERS=4 [15, 19) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 05:52:52,378] [INFO] [engine.py:134:__init__] RANK=96 STAGE=3 LAYERS=4 [15, 19) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 05:52:52,378] [INFO] [engine.py:134:__init__] RANK=98 STAGE=3 LAYERS=4 [15, 19) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 05:52:52,378] [INFO] [engine.py:134:__init__] RANK=32 STAGE=1 LAYERS=4 [7, 11) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 05:52:52,378] [INFO] [engine.py:134:__init__] RANK=35 STAGE=1 LAYERS=4 [7, 11) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 05:52:52,378] [INFO] [engine.py:134:__init__] RANK=34 STAGE=1 LAYERS=4 [7, 11) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 05:52:52,378] [INFO] [engine.py:134:__init__] RANK=33 STAGE=1 LAYERS=4 [7, 11) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 05:52:52,378] [INFO] [engine.py:134:__init__] RANK=160 STAGE=5 LAYERS=4 [23, 27) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 05:52:52,378] [INFO] [engine.py:134:__init__] RANK=161 STAGE=5 LAYERS=4 [23, 27) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 05:52:52,378] [INFO] [engine.py:134:__init__] RANK=224 STAGE=7 LAYERS=8 [31, 39) STAGE_PARAMS=1986498560 (1986.499M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 05:52:52,378] [INFO] [engine.py:134:__init__] RANK=227 STAGE=7 LAYERS=8 [31, 39) STAGE_PARAMS=1986498560 (1986.499M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 05:52:52,378] [INFO] [engine.py:134:__init__] RANK=226 STAGE=7 LAYERS=8 [31, 39) STAGE_PARAMS=1986498560 (1986.499M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 05:52:52,378] [INFO] [engine.py:134:__init__] RANK=225 STAGE=7 LAYERS=8 [31, 39) STAGE_PARAMS=1986498560 (1986.499M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 05:52:52,378] [INFO] [engine.py:134:__init__] RANK=99 STAGE=3 LAYERS=4 [15, 19) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 05:52:52,378] [INFO] [engine.py:134:__init__] RANK=163 STAGE=5 LAYERS=4 [23, 27) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-24 05:52:52,378] [INFO] [engine.py:134:__init__] RANK=162 STAGE=5 LAYERS=4 [23, 27) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) 
UNIQUE_PARAMS=56814206976 (56814.207M) - > using checkpoint value 6e-05 for learning rate - > using checkpoint value 6e-06 for minimum learning rate - > using checkpoint value 216320 for warmup iterations - > using checkpoint value 126953125 for total number of iterations - > using checkpoint value cosine for decay style -successfully loaded 8 ZeRO state_dicts for rank 168 -successfully loaded 8 ZeRO state_dicts for rank 171 -successfully loaded 8 ZeRO state_dicts for rank 176 -successfully loaded 8 ZeRO state_dicts for rank 88 -successfully loaded 8 ZeRO state_dicts for rank 170 -successfully loaded 8 ZeRO state_dicts for rank 132 -successfully loaded 8 ZeRO state_dicts for rank 156 -successfully loaded 8 ZeRO state_dicts for rank 169 -successfully loaded 8 ZeRO state_dicts for rank 159 -successfully loaded 8 ZeRO state_dicts for rank 124 -successfully loaded 8 ZeRO state_dicts for rank 32 -successfully loaded 8 ZeRO state_dicts for rank 49 -successfully loaded 8 ZeRO state_dicts for rank 96 -successfully loaded 8 ZeRO state_dicts for rank 167 -successfully loaded 8 ZeRO state_dicts for rank 127 -successfully loaded 8 ZeRO state_dicts for rank 60 -successfully loaded 8 ZeRO state_dicts for rank 148 -successfully loaded 8 ZeRO state_dicts for rank 48 -successfully loaded 8 ZeRO state_dicts for rank 99 -successfully loaded 8 ZeRO state_dicts for rank 140 -successfully loaded 8 ZeRO state_dicts for rank 144 -successfully loaded 8 ZeRO state_dicts for rank 104 -successfully loaded 8 ZeRO state_dicts for rank 112 -successfully loaded 8 ZeRO state_dicts for rank 68 -successfully loaded 8 ZeRO state_dicts for rank 120 -loading 8 zero partition checkpoints for rank 168 -successfully loaded 8 ZeRO state_dicts for rank 193 -successfully loaded 8 ZeRO state_dicts for rank 210 -successfully loaded 8 ZeRO state_dicts for rank 69 -successfully loaded 8 ZeRO state_dicts for rank 52 -successfully loaded 8 ZeRO state_dicts for rank 157 -successfully loaded 8 ZeRO state_dicts for rank 40 -successfully loaded 8 ZeRO state_dicts for rank 129 -successfully loaded 8 ZeRO state_dicts for rank 201 -successfully loaded 8 ZeRO state_dicts for rank 209 -successfully loaded 8 ZeRO state_dicts for rank 145 -successfully loaded 8 ZeRO state_dicts for rank 111 -successfully loaded 8 ZeRO state_dicts for rank 211 -successfully loaded 8 ZeRO state_dicts for rank 135 -successfully loaded 8 ZeRO state_dicts for rank 141 -successfully loaded 8 ZeRO state_dicts for rank 139 -successfully loaded 8 ZeRO state_dicts for rank 172 -successfully loaded 8 ZeRO state_dicts for rank 80 -successfully loaded 8 ZeRO state_dicts for rank 215 -successfully loaded 8 ZeRO state_dicts for rank 106 -successfully loaded 8 ZeRO state_dicts for rank 187 -successfully loaded 8 ZeRO state_dicts for rank 137 -successfully loaded 8 ZeRO state_dicts for rank 133 -successfully loaded 8 ZeRO state_dicts for rank 90 -successfully loaded 8 ZeRO state_dicts for rank 74 -successfully loaded 8 ZeRO state_dicts for rank 34 -successfully loaded 8 ZeRO state_dicts for rank 143 -successfully loaded 8 ZeRO state_dicts for rank 200 -successfully loaded 8 ZeRO state_dicts for rank 122 -successfully loaded 8 ZeRO state_dicts for rank 125 -successfully loaded 8 ZeRO state_dicts for rank 228 -successfully loaded 8 ZeRO state_dicts for rank 81 -successfully loaded 8 ZeRO state_dicts for rank 105 -successfully loaded 8 ZeRO state_dicts for rank 163 -successfully loaded 8 ZeRO state_dicts for rank 64 -successfully loaded 8 ZeRO state_dicts for rank 186 -successfully 
loaded 8 ZeRO state_dicts for rank 97 -successfully loaded 8 ZeRO state_dicts for rank 70 -successfully loaded 8 ZeRO state_dicts for rank 51 -successfully loaded 8 ZeRO state_dicts for rank 77 -successfully loaded 8 ZeRO state_dicts for rank 160 -successfully loaded 8 ZeRO state_dicts for rank 50 -successfully loaded 8 ZeRO state_dicts for rank 202 -successfully loaded 8 ZeRO state_dicts for rank 98 -successfully loaded 8 ZeRO state_dicts for rank 20 -successfully loaded 8 ZeRO state_dicts for rank 85 -successfully loaded 8 ZeRO state_dicts for rank 89 -successfully loaded 8 ZeRO state_dicts for rank 214 -successfully loaded 8 ZeRO state_dicts for rank 114 -successfully loaded 8 ZeRO state_dicts for rank 149 -successfully loaded 8 ZeRO state_dicts for rank 123 -successfully loaded 8 ZeRO state_dicts for rank 71 -successfully loaded 8 ZeRO state_dicts for rank 126 -successfully loaded 8 ZeRO state_dicts for rank 152 -successfully loaded 8 ZeRO state_dicts for rank 203 -successfully loaded 8 ZeRO state_dicts for rank 166 -successfully loaded 8 ZeRO state_dicts for rank 41 -successfully loaded 8 ZeRO state_dicts for rank 222 -successfully loaded 8 ZeRO state_dicts for rank 130 -successfully loaded 8 ZeRO state_dicts for rank 216 -successfully loaded 8 ZeRO state_dicts for rank 84 -successfully loaded 8 ZeRO state_dicts for rank 100 -successfully loaded 8 ZeRO state_dicts for rank 42 -successfully loaded 8 ZeRO state_dicts for rank 190 -successfully loaded 8 ZeRO state_dicts for rank 12 -successfully loaded 8 ZeRO state_dicts for rank 44 -successfully loaded 8 ZeRO state_dicts for rank 108 -successfully loaded 8 ZeRO state_dicts for rank 219 -successfully loaded 8 ZeRO state_dicts for rank 206 -successfully loaded 8 ZeRO state_dicts for rank 128 -successfully loaded 8 ZeRO state_dicts for rank 37 -successfully loaded 8 ZeRO state_dicts for rank 33 -successfully loaded 8 ZeRO state_dicts for rank 56 -successfully loaded 8 ZeRO state_dicts for rank 62 -successfully loaded 8 ZeRO state_dicts for rank 115 -successfully loaded 8 ZeRO state_dicts for rank 24 -successfully loaded 8 ZeRO state_dicts for rank 45 -successfully loaded 8 ZeRO state_dicts for rank 192 -successfully loaded 8 ZeRO state_dicts for rank 153 -successfully loaded 8 ZeRO state_dicts for rank 134 -successfully loaded 8 ZeRO state_dicts for rank 136 -successfully loaded 8 ZeRO state_dicts for rank 38 -successfully loaded 8 ZeRO state_dicts for rank 131 -successfully loaded 8 ZeRO state_dicts for rank 121 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-24 05:53:20 CEST)" was missed by 0:00:03.058626 -successfully loaded 8 ZeRO state_dicts for rank 217 -successfully loaded 8 ZeRO state_dicts for rank 146 -successfully loaded 8 ZeRO state_dicts for rank 195 -successfully loaded 8 ZeRO state_dicts for rank 82 -successfully loaded 8 ZeRO state_dicts for rank 191 -successfully loaded 8 ZeRO state_dicts for rank 113 -successfully loaded 8 ZeRO state_dicts for rank 158 -successfully loaded 8 ZeRO state_dicts for rank 208 -loading 8 zero partition checkpoints for rank 176 -successfully loaded 8 ZeRO state_dicts for rank 65 -successfully loaded 8 ZeRO state_dicts for rank 78 -successfully loaded 8 ZeRO state_dicts for rank 93 -successfully loaded 8 ZeRO state_dicts for rank 188 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-24 05:53:20 CEST)" was missed 
by 0:00:03.434951 -successfully loaded 8 ZeRO state_dicts for rank 162 -successfully loaded 8 ZeRO state_dicts for rank 63 -successfully loaded 8 ZeRO state_dicts for rank 61 -successfully loaded 8 ZeRO state_dicts for rank 221 -successfully loaded 8 ZeRO state_dicts for rank 107 -successfully loaded 8 ZeRO state_dicts for rank 179 -successfully loaded 8 ZeRO state_dicts for rank 147 -successfully loaded 8 ZeRO state_dicts for rank 36 -loading 8 zero partition checkpoints for rank 132 -successfully loaded 8 ZeRO state_dicts for rank 116 -successfully loaded 8 ZeRO state_dicts for rank 199 -loading 8 zero partition checkpoints for rank 88 -loading 8 zero partition checkpoints for rank 170 -successfully loaded 8 ZeRO state_dicts for rank 151 -successfully loaded 8 ZeRO state_dicts for rank 76 -successfully loaded 8 ZeRO state_dicts for rank 35 -successfully loaded 8 ZeRO state_dicts for rank 223 -successfully loaded 8 ZeRO state_dicts for rank 175 -successfully loaded 8 ZeRO state_dicts for rank 13 -successfully loaded 8 ZeRO state_dicts for rank 207 -successfully loaded 8 ZeRO state_dicts for rank 218 -successfully loaded 8 ZeRO state_dicts for rank 213 -successfully loaded 8 ZeRO state_dicts for rank 119 -successfully loaded 8 ZeRO state_dicts for rank 198 -successfully loaded 8 ZeRO state_dicts for rank 164 -loading 8 zero partition checkpoints for rank 159 -successfully loaded 8 ZeRO state_dicts for rank 109 -successfully loaded 8 ZeRO state_dicts for rank 197 -successfully loaded 8 ZeRO state_dicts for rank 66 -successfully loaded 8 ZeRO state_dicts for rank 22 -successfully loaded 8 ZeRO state_dicts for rank 185 -successfully loaded 8 ZeRO state_dicts for rank 196 -successfully loaded 8 ZeRO state_dicts for rank 43 -successfully loaded 8 ZeRO state_dicts for rank 204 -successfully loaded 8 ZeRO state_dicts for rank 205 -successfully loaded 8 ZeRO state_dicts for rank 181 -successfully loaded 8 ZeRO state_dicts for rank 25 -successfully loaded 8 ZeRO state_dicts for rank 91 -successfully loaded 8 ZeRO state_dicts for rank 212 -successfully loaded 8 ZeRO state_dicts for rank 173 -successfully loaded 8 ZeRO state_dicts for rank 39 -successfully loaded 8 ZeRO state_dicts for rank 161 -successfully loaded 8 ZeRO state_dicts for rank 29 -successfully loaded 8 ZeRO state_dicts for rank 26 -successfully loaded 8 ZeRO state_dicts for rank 180 -successfully loaded 8 ZeRO state_dicts for rank 28 -successfully loaded 8 ZeRO state_dicts for rank 87 -successfully loaded 8 ZeRO state_dicts for rank 53 -successfully loaded 8 ZeRO state_dicts for rank 194 -successfully loaded 8 ZeRO state_dicts for rank 54 -successfully loaded 8 ZeRO state_dicts for rank 73 -successfully loaded 8 ZeRO state_dicts for rank 21 -successfully loaded 8 ZeRO state_dicts for rank 27 -successfully loaded 8 ZeRO state_dicts for rank 46 -successfully loaded 8 ZeRO state_dicts for rank 67 -loading 8 zero partition checkpoints for rank 32 -successfully loaded 8 ZeRO state_dicts for rank 184 -successfully loaded 8 ZeRO state_dicts for rank 165 -successfully loaded 8 ZeRO state_dicts for rank 118 -successfully loaded 8 ZeRO state_dicts for rank 220 -successfully loaded 8 ZeRO state_dicts for rank 57 -successfully loaded 8 ZeRO state_dicts for rank 75 -successfully loaded 8 ZeRO state_dicts for rank 0 -successfully loaded 8 ZeRO state_dicts for rank 92 -loading 8 zero partition checkpoints for rank 124 -successfully loaded 8 ZeRO state_dicts for rank 94 -successfully loaded 8 ZeRO state_dicts for rank 55 -successfully loaded 8 ZeRO 
state_dicts for rank 72
-successfully loaded 8 ZeRO state_dicts for rank 83
-successfully loaded 8 ZeRO state_dicts for rank 6
-loading 8 zero partition checkpoints for rank 167
-[... the same two per-rank messages ("successfully loaded 8 ZeRO state_dicts for rank N", then "loading 8 zero partition checkpoints for rank N") repeat here, interleaved, once per rank for all 256 ranks ...]
- checkpoint version 3.0
-successfully loaded 8 ZeRO state_dicts for rank 17
-successfully loaded 8 ZeRO state_dicts for rank 19
-successfully loaded 8 ZeRO state_dicts for rank 18
-successfully loaded 8 ZeRO state_dicts for rank 16
-loading 8 zero partition checkpoints for rank 17
-loading 8 zero partition checkpoints for rank 19
-loading 8 zero partition checkpoints for rank 18
-loading 8 zero partition checkpoints for rank 16
- successfully loaded checkpoint from /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints at iteration 942
-time (ms) | load-checkpoint: 82978.97
-[after model, optimizer, and learning rate scheduler are built] datetime: 2021-09-24 05:54:15
-> building train, validation, and test datasets ...
- > datasets target sizes (minimum size):
-    train:      300000000
-    validation: 1638400
-    test:       10240
-> building train, validation, and test datasets for GPT ...
- > building dataset index ...
-    reading sizes...
-    reading pointers...
-    reading document index...
-    creating numpy buffer of mmap...
-    creating memory view of numpy buffer...
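The per-rank messages above are DeepSpeed restoring a ZeRO checkpoint: every rank reloads the module weights and then its 8 optimizer-state partitions, which is why the whole load takes 82978.97 ms across 256 ranks. Below is a minimal, hypothetical sketch of that resume path, assuming a toy model and config rather than the actual tr8-104B setup, and assuming a distributed launch via the `deepspeed` launcher:

```python
# Hypothetical resume sketch -- the tiny model, lr, and config values are
# placeholders, not the tr8-104B configuration.
import torch
import deepspeed

model = torch.nn.Linear(1024, 1024)  # stand-in for the real model
ds_config = {
    "train_micro_batch_size_per_gpu": 1,
    "fp16": {"enabled": True},
    "zero_optimization": {"stage": 1},  # ZeRO optimizer-state sharding
    "optimizer": {"type": "AdamW", "params": {"lr": 6e-5}},
}

engine, _, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
)

# This call restores module weights plus each rank's ZeRO partitions; the
# "successfully loaded ... ZeRO state_dicts" / "loading ... zero partition
# checkpoints for rank N" lines above are emitted from inside it.
load_path, client_state = engine.load_checkpoint(
    "/gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints",
    tag=None,  # None -> resume from the tag recorded in the `latest` file
)
```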
- > finished creating indexed dataset in 0.135933 seconds
-    number of documents: 304230423
- > dataset split:
-    train:
-     document indices in [0, 288714672) total of 288714672 documents
-    validation:
-     document indices in [288714672, 303926193) total of 15211521 documents
-    test:
-     document indices in [303926193, 304230423) total of 304230 documents
- > loading doc-idx mapping from /gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document_train_indexmap_300000000ns_2048sl_42s_doc_idx.npy
- > loading sample-idx mapping from /gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document_train_indexmap_300000000ns_2048sl_42s_sample_idx.npy
- > loading shuffle-idx mapping from /gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document_train_indexmap_300000000ns_2048sl_42s_shuffle_idx.npy
-    loaded indexed file in 0.348 seconds
-    total number of samples: 394611670
-    total number of epochs: 3
- > loading doc-idx mapping from /gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document_valid_indexmap_1638400ns_2048sl_42s_doc_idx.npy
- > loading sample-idx mapping from /gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document_valid_indexmap_1638400ns_2048sl_42s_sample_idx.npy
- > loading shuffle-idx mapping from /gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document_valid_indexmap_1638400ns_2048sl_42s_shuffle_idx.npy
-    loaded indexed file in 0.321 seconds
-    total number of samples: 6927161
-    total number of epochs: 1
- > loading doc-idx mapping from /gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document_test_indexmap_10240ns_2048sl_42s_doc_idx.npy
- > loading sample-idx mapping from /gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document_test_indexmap_10240ns_2048sl_42s_sample_idx.npy
- > loading shuffle-idx mapping from /gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document_test_indexmap_10240ns_2048sl_42s_shuffle_idx.npy
-    loaded indexed file in 0.062 seconds
-    total number of samples: 137384
-    total number of epochs: 1
-> finished creating GPT datasets ...
-[after dataloaders are built] datetime: 2021-09-24 05:54:21
-done with setup ...
-training ...
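The three `.npy` mappings loaded above are what turn the mmap'ed indexed dataset into fixed-length GPT samples: `doc_idx` is the shuffled, 3-epoch document order, `sample_idx` marks where each 2048-token sample starts and ends inside that order, and `shuffle_idx` permutes the samples. A sketch of how they compose at lookup time, following the file naming in the log; the exact logic lives in Megatron-LM's GPTDataset, and the shapes and the on-disk paths here are assumptions:

```python
# Hypothetical reconstruction of the index lookup; requires the .npy files
# named in the log to exist on disk.
import numpy as np

prefix = "meg-gpt2_text_document_train_indexmap_300000000ns_2048sl_42s"
doc_idx     = np.load(prefix + "_doc_idx.npy", mmap_mode="r")      # shuffled doc order over 3 epochs
sample_idx  = np.load(prefix + "_sample_idx.npy", mmap_mode="r")   # (num_samples+1, 2): [doc position, token offset]
shuffle_idx = np.load(prefix + "_shuffle_idx.npy", mmap_mode="r")  # permutation over sample ids

def sample_spans(i):
    """Return the documents and token offsets that make up training sample i."""
    j = shuffle_idx[i]               # shuffle at the sample level
    doc_f, offset_f = sample_idx[j]      # first doc position and start offset
    doc_l, offset_l = sample_idx[j + 1]  # last doc position and end offset
    docs = [doc_idx[d] for d in range(doc_f, doc_l + 1)]
    # the caller then gathers 2048(+1) tokens from these documents
    return docs, offset_f, offset_l
```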
-time (ms) | model-and-optimizer-setup: 91017.54 | train/valid/test-data-iterators-setup: 4740.91
-[before the start of training step] datetime: 2021-09-24 05:54:21
-[2021-09-24 05:54:21,235] [INFO] [checkpointing.py:408:forward] Activation Checkpointing Information
-[2021-09-24 05:54:21,235] [INFO] [checkpointing.py:409:forward] ----Partition Activations False, CPU CHECKPOINTING False
-[2021-09-24 05:54:21,235] [INFO] [checkpointing.py:412:forward] ----contiguous Memory Checkpointing False with 32 total layers
-[2021-09-24 05:54:21,235] [INFO] [checkpointing.py:415:forward] ----Synchronization False
-[2021-09-24 05:54:21,235] [INFO] [checkpointing.py:416:forward] ----Profiling time in checkpointing False
-[Rank 1] (after 943 iterations) memory (MB) | allocated: 6661.611328125 | max allocated: 11742.55810546875 | reserved: 22890.0 | max reserved: 22890.0
-[Rank 225] (after 943 iterations) memory (MB) | allocated: 7107.70751953125 | max allocated: 11884.6845703125 | reserved: 22108.0 | max reserved: 22108.0
-[Rank 65] (after 943 iterations) memory (MB) | allocated: 5861.5498046875 | max allocated: 10450.46337890625 | reserved: 18442.0 | max reserved: 18442.0
-[Rank 0] (after 943 iterations) memory (MB) | allocated: 6661.611328125 | max allocated: 11742.55810546875 | reserved: 23526.0 | max reserved: 23526.0
-[... analogous memory reports from the other logging ranks (2-227) elided: middle pipeline stages sit at ~5861 MB allocated / 10450 MB max allocated / 18442-19076 MB reserved, the first stage (ranks 0-3) at ~6662 MB allocated / 21150-23526 MB reserved, the last stage (ranks 224-227) at ~7108 MB allocated / 21700-22492 MB reserved ...]
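Per-rank lines like `[Rank 1] (after 943 iterations) memory (MB) | ...` come from Megatron's memory reporting; the gap between ~6 GB allocated and ~18-23 GB reserved is the CUDA caching allocator holding on to freed blocks for reuse. A torch-level equivalent (a sketch, not Megatron's own `report_memory` helper):

```python
import torch

def report_memory(rank: int, tag: str) -> str:
    # All four counters are standard torch.cuda statistics, converted to MB
    # to match the log's units.
    mb = 1024.0 * 1024.0
    return (
        f"[Rank {rank}] ({tag}) memory (MB)"
        f" | allocated: {torch.cuda.memory_allocated() / mb}"
        f" | max allocated: {torch.cuda.max_memory_allocated() / mb}"
        f" | reserved: {torch.cuda.memory_reserved() / mb}"
        f" | max reserved: {torch.cuda.max_memory_reserved() / mb}"
    )

if torch.cuda.is_available():
    print(report_memory(0, "after 943 iterations"))
```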
- iteration 943/ 159576 | consumed samples: 15088 | elapsed time per iteration (ms): 29806.1 | learning rate: 4.185E-06 | global batch size: 16 | lm loss: 7.642442E+00 | loss scale: 8192.0 | grad norm: 53639.718 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 944/ 159576 | consumed samples: 15104 | elapsed time per iteration (ms): 13012.2 | learning rate: 4.189E-06 | global batch size: 16 | lm loss: 7.638637E+00 | loss scale: 8192.0 | grad norm: 47002.321 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 945/ 159576 | consumed samples: 15120 | elapsed time per iteration (ms): 13551.8 | learning rate: 4.194E-06 | global batch size: 16 | lm loss: 7.559312E+00 | loss scale: 8192.0 | grad norm: 43680.206 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 946/ 159576 | consumed samples: 15136 | elapsed time per iteration (ms): 13672.0 | learning rate: 4.198E-06 | global batch size: 16 | lm loss: 7.372701E+00 | loss scale: 8192.0 | grad norm: 29642.562 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 947/ 159576 | consumed samples: 15152 | elapsed time per iteration (ms): 13523.5 | learning rate: 4.203E-06 | global batch size: 16 | lm loss: 7.431667E+00 | loss scale: 8192.0 | grad norm: 71525.963 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
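Two consistency checks on the records above: consumed samples advance by exactly the global batch size each iteration (943 × 16 = 15088), and the learning rate climbs by ≈ 4.44E-09 per iteration, consistent with a linear warmup. The slope below is inferred from the logged values, not read from the training config:

```python
GLOBAL_BATCH_SIZE = 16

def consumed_samples(iteration: int) -> int:
    # batch size is constant over this stretch of the log (no rampup yet)
    return iteration * GLOBAL_BATCH_SIZE

for it, logged in [(943, 15088), (944, 15104), (947, 15152)]:
    assert consumed_samples(it) == logged

# per-iteration LR increment implied by iteration 943's logged value;
# an inference, not the configured schedule
LR_SLOPE = 4.185e-06 / 943

for it, logged_lr in [(945, 4.194e-06), (947, 4.203e-06)]:
    # logged values carry 4 significant digits, hence the loose tolerance
    assert abs(it * LR_SLOPE - logged_lr) < 5e-09
```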
- iteration 948/ 159576 | consumed samples: 15168 | elapsed time per iteration (ms): 13571.1 | learning rate: 4.207E-06 | global batch size: 16 | lm loss: 7.622519E+00 | loss scale: 8192.0 | grad norm: 108314.372 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 949/ 159576 | consumed samples: 15184 | elapsed time per iteration (ms): 13513.7 | learning rate: 4.212E-06 | global batch size: 16 | lm loss: 7.491040E+00 | loss scale: 8192.0 | grad norm: 83775.616 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 950/ 159576 | consumed samples: 15200 | elapsed time per iteration (ms): 13857.2 | learning rate: 4.216E-06 | global batch size: 16 | lm loss: 7.689845E+00 | loss scale: 8192.0 | grad norm: 42694.796 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-[... iterations 951-982 elided, same format: learning rate 4.220E-06 -> 4.358E-06, lm loss 6.926525E+00-7.727429E+00, loss scale 8192.0, grad norm 2.26E+04-4.78E+04, no skipped or nan iterations ...]
-[2021-09-24 06:03:44] PULSE: tr8-104B is waiting for the previous Job Array job to finish before scheduling a new one (1162855_[2-10%1] on 'gpu_p13' partition)
-[2021-09-24 06:03:44] PULSE: tr8-104B is running for 11:33 since 2021-09-24T05:52:11 (1162855_1 on 'gpu_p13' partition (r6i4n[5,7],r6i5n[2,7-8],r6i6n[0,2,6],r7i2n[4-5],r7i6n[2-4],r7i7n[7-8],r8i0n[2-3,5-8],r8i1n[0,2-4],r8i2n8,r8i3n[0-2],r8i5n[3-4],r8i7n[3-8],r9i0n[0-2],r9i1n[0-3],r9i2n[3-5,8],r9i3n[0-1,7-8],r9i4n[0-2],r9i5n[3-8],r9i6n[0,7-8])
- iteration 983/ 159576 | consumed samples: 15728 | elapsed time per iteration (ms): 13560.9 | learning rate: 4.362E-06 | global batch size: 16 | lm loss: 7.437976E+00 | loss scale: 8192.0 | grad norm: 25658.380 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-[... iterations 984-991 elided: learning rate 4.367E-06 -> 4.398E-06, lm loss 7.482485E+00-7.601320E+00, grad norm 2.28E+04-3.01E+04 ...]
- iteration 992/ 159576 | consumed samples: 15872 | elapsed time per iteration (ms): 13596.4 | learning rate: 4.402E-06 | global batch size: 16 | lm loss: 7.530395E+00 | loss scale: 8192.0 | grad norm: 25061.967 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 993/ 159576 | consumed samples: 15888 | elapsed time per iteration (ms): 13641.4 | learning rate: 4.407E-06 | global batch size: 16 | lm loss: 7.547958E+00 | loss scale: 8192.0 | grad norm: 24314.301 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-[... iterations 994-998 elided: learning rate 4.411E-06 -> 4.429E-06, lm loss 7.332575E+00-7.511089E+00, grad norm 2.67E+04-3.09E+04 ...]
- iteration 999/ 159576 | consumed samples: 15984 | elapsed time per iteration (ms): 13808.8 | learning rate: 4.433E-06 | global batch size: 16 | lm loss: 7.504936E+00 | loss scale: 8192.0 | grad norm: 33504.939 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1000/ 159576 | consumed samples: 16000 | elapsed time per iteration (ms): 13740.5 | learning rate: 4.438E-06 | global batch size: 16 | lm loss: 7.441235E+00 | loss scale: 16384.0 | grad norm: 39922.218 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-------------------------------------------------------------------------------------------------
- validation loss at iteration 1000 | lm loss value: 7.422922E+00 | lm loss PPL: 1.673917E+03 |
-------------------------------------------------------------------------------------------------
- iteration 1001/ 159576 | consumed samples: 16016 | elapsed time per iteration (ms): 18607.4 | learning rate: 4.442E-06 | global batch size: 16 | lm loss: 7.375732E+00 | loss scale: 16384.0 | grad norm: 55247.055 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-[... iterations 1002-1013 elided: learning rate 4.447E-06 -> 4.496E-06, lm loss 7.360181E+00-7.774883E+00, loss scale 16384.0, grad norm 5.31E+04-9.73E+04 ...]
- iteration 1014/ 159576 | consumed samples: 16224 | elapsed time per iteration (ms): 13713.9 | learning rate: 4.500E-06 | global batch size: 16 | lm loss: 7.345057E+00 | loss scale: 16384.0 | grad norm: 58991.980 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
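The validation PPL reported at iteration 1000 is just `exp(lm loss)`; reproducing the logged numbers:

```python
import math

lm_loss = 7.422922                 # lm loss value at iteration 1000
ppl = math.exp(lm_loss)
print(f"lm loss PPL: {ppl:.6E}")   # ~1.673917E+03, matching the log line
assert abs(ppl - 1.673917e3) < 0.1
```

Note also that iteration 1001's elapsed time (18607.4 ms vs the steady ~13.6 s) absorbs the validation pass that precedes it.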
- iteration 1015/ 159576 | consumed samples: 16240 | elapsed time per iteration (ms): 13740.6 | learning rate: 4.504E-06 | global batch size: 16 | lm loss: 7.369339E+00 | loss scale: 16384.0 | grad norm: 76798.488 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-[... iterations 1016-1036 elided: learning rate 4.509E-06 -> 4.598E-06, lm loss 7.110139E+00-7.676594E+00, loss scale 16384.0, grad norm 5.23E+04-9.18E+04 ...]
- iteration 1037/ 159576 | consumed samples: 16592 | elapsed time per iteration (ms): 13995.5 | learning rate: 4.602E-06 | global batch size: 16 | lm loss: 7.423755E+00 | loss scale: 16384.0 | grad norm: 70528.206 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
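The loss scale that doubled from 8192.0 to 16384.0 at iteration 1000 is dynamic fp16 loss scaling at work: after a long window with zero skipped (overflow) iterations, the scaler grows the scale. The 1000-step growth window below is inferred from the log, not read from the config; a minimal sketch of such a scaler:

```python
class DynamicLossScaler:
    """Grow the scale after `growth_interval` overflow-free steps,
    shrink it (and skip the step) on overflow."""
    def __init__(self, init_scale=8192.0, growth_factor=2.0,
                 backoff_factor=0.5, growth_interval=1000):
        self.scale = init_scale
        self.growth_factor = growth_factor
        self.backoff_factor = backoff_factor
        self.growth_interval = growth_interval
        self._good_steps = 0

    def update(self, found_overflow: bool) -> None:
        if found_overflow:
            self.scale *= self.backoff_factor  # back off and reset the window
            self._good_steps = 0
        else:
            self._good_steps += 1
            if self._good_steps == self.growth_interval:
                self.scale *= self.growth_factor
                self._good_steps = 0

scaler = DynamicLossScaler()
for _ in range(1000):                 # 1000 overflow-free iterations ...
    scaler.update(found_overflow=False)
assert scaler.scale == 16384.0        # ... and the scale doubles, as in the log
```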
- iteration 1038/ 159576 | consumed samples: 16608 | elapsed time per iteration (ms): 13797.2 | learning rate: 4.607E-06 | global batch size: 16 | lm loss: 7.452043E+00 | loss scale: 16384.0 | grad norm: 63200.280 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1039/ 159576 | consumed samples: 16624 | elapsed time per iteration (ms): 13728.6 | learning rate: 4.611E-06 | global batch size: 16 | lm loss: 7.310857E+00 | loss scale: 16384.0 | grad norm: 135045.434 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-[... iterations 1040-1071 elided: learning rate 4.615E-06 -> 4.753E-06, lm loss 7.310302E+00-7.790638E+00, loss scale 16384.0, grad norm 5.63E+04-1.43E+05 ...]
- iteration 1072/ 159576 | consumed samples: 17152 | elapsed time per iteration (ms): 14062.2 | learning rate: 4.757E-06 | global batch size: 16 | lm loss: 7.411667E+00 | loss scale: 16384.0 | grad norm: 77225.681 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
learning rate: 4.757E-06 | global batch size: 16 | lm loss: 7.411667E+00 | loss scale: 16384.0 | grad norm: 77225.681 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1073/ 159576 | consumed samples: 17168 | elapsed time per iteration (ms): 13681.2 | learning rate: 4.762E-06 | global batch size: 16 | lm loss: 7.394706E+00 | loss scale: 16384.0 | grad norm: 78590.421 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1074/ 159576 | consumed samples: 17184 | elapsed time per iteration (ms): 13709.1 | learning rate: 4.766E-06 | global batch size: 16 | lm loss: 7.616404E+00 | loss scale: 16384.0 | grad norm: 82722.799 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1075/ 159576 | consumed samples: 17200 | elapsed time per iteration (ms): 13743.2 | learning rate: 4.771E-06 | global batch size: 16 | lm loss: 7.395072E+00 | loss scale: 16384.0 | grad norm: 63549.807 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1076/ 159576 | consumed samples: 17216 | elapsed time per iteration (ms): 13619.1 | learning rate: 4.775E-06 | global batch size: 16 | lm loss: 7.593513E+00 | loss scale: 16384.0 | grad norm: 100985.259 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1077/ 159576 | consumed samples: 17232 | elapsed time per iteration (ms): 13859.6 | learning rate: 4.780E-06 | global batch size: 16 | lm loss: 7.379070E+00 | loss scale: 16384.0 | grad norm: 56935.671 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1078/ 159576 | consumed samples: 17248 | elapsed time per iteration (ms): 13589.7 | learning rate: 4.784E-06 | global batch size: 16 | lm loss: 7.412032E+00 | loss scale: 16384.0 | grad norm: 93391.483 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1079/ 159576 | consumed samples: 17264 | elapsed time per iteration (ms): 13575.0 | learning rate: 4.788E-06 | global batch size: 16 | lm loss: 7.485137E+00 | loss scale: 16384.0 | grad norm: 70759.989 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1080/ 159576 | consumed samples: 17280 | elapsed time per iteration (ms): 13590.9 | learning rate: 4.793E-06 | global batch size: 16 | lm loss: 7.410018E+00 | loss scale: 16384.0 | grad norm: 108070.843 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1081/ 159576 | consumed samples: 17296 | elapsed time per iteration (ms): 13934.8 | learning rate: 4.797E-06 | global batch size: 16 | lm loss: 7.444709E+00 | loss scale: 16384.0 | grad norm: 93912.071 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1082/ 159576 | consumed samples: 17312 | elapsed time per iteration (ms): 13598.4 | learning rate: 4.802E-06 | global batch size: 16 | lm loss: 7.532929E+00 | loss scale: 16384.0 | grad norm: 76683.978 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1083/ 159576 | consumed samples: 17328 | elapsed time per iteration (ms): 13510.5 | learning rate: 4.806E-06 | global batch size: 16 | lm loss: 7.599612E+00 | loss scale: 16384.0 | grad norm: 83858.264 | num zeros: 0.0 | number of skipped 
iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1084/ 159576 | consumed samples: 17344 | elapsed time per iteration (ms): 13542.7 | learning rate: 4.811E-06 | global batch size: 16 | lm loss: 7.387773E+00 | loss scale: 16384.0 | grad norm: 63120.576 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1085/ 159576 | consumed samples: 17360 | elapsed time per iteration (ms): 13555.5 | learning rate: 4.815E-06 | global batch size: 16 | lm loss: 7.289794E+00 | loss scale: 16384.0 | grad norm: 77022.669 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1086/ 159576 | consumed samples: 17376 | elapsed time per iteration (ms): 13932.5 | learning rate: 4.820E-06 | global batch size: 16 | lm loss: 7.393349E+00 | loss scale: 16384.0 | grad norm: 79433.611 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1087/ 159576 | consumed samples: 17392 | elapsed time per iteration (ms): 13479.9 | learning rate: 4.824E-06 | global batch size: 16 | lm loss: 7.321753E+00 | loss scale: 16384.0 | grad norm: 68970.976 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1088/ 159576 | consumed samples: 17408 | elapsed time per iteration (ms): 13681.0 | learning rate: 4.828E-06 | global batch size: 16 | lm loss: 7.320374E+00 | loss scale: 16384.0 | grad norm: 73549.447 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1089/ 159576 | consumed samples: 17424 | elapsed time per iteration (ms): 13654.0 | learning rate: 4.833E-06 | global batch size: 16 | lm loss: 7.605762E+00 | loss scale: 16384.0 | grad norm: 80374.482 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1090/ 159576 | consumed samples: 17440 | elapsed time per iteration (ms): 14059.3 | learning rate: 4.837E-06 | global batch size: 16 | lm loss: 7.631133E+00 | loss scale: 16384.0 | grad norm: 82954.080 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1091/ 159576 | consumed samples: 17456 | elapsed time per iteration (ms): 13724.8 | learning rate: 4.842E-06 | global batch size: 16 | lm loss: 7.507143E+00 | loss scale: 16384.0 | grad norm: 60066.048 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1092/ 159576 | consumed samples: 17472 | elapsed time per iteration (ms): 13461.4 | learning rate: 4.846E-06 | global batch size: 16 | lm loss: 7.300464E+00 | loss scale: 16384.0 | grad norm: 116487.793 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1093/ 159576 | consumed samples: 17488 | elapsed time per iteration (ms): 13525.0 | learning rate: 4.851E-06 | global batch size: 16 | lm loss: 7.388405E+00 | loss scale: 16384.0 | grad norm: 79147.305 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1094/ 159576 | consumed samples: 17504 | elapsed time per iteration (ms): 13950.4 | learning rate: 4.855E-06 | global batch size: 16 | lm loss: 7.471725E+00 | loss scale: 16384.0 | grad norm: 90987.897 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1095/ 159576 | consumed samples: 17520 | elapsed time per iteration (ms): 13624.6 | learning 
rate: 4.859E-06 | global batch size: 16 | lm loss: 7.530853E+00 | loss scale: 16384.0 | grad norm: 90057.826 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1096/ 159576 | consumed samples: 17536 | elapsed time per iteration (ms): 13591.9 | learning rate: 4.864E-06 | global batch size: 16 | lm loss: 7.420722E+00 | loss scale: 16384.0 | grad norm: 76037.442 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1097/ 159576 | consumed samples: 17552 | elapsed time per iteration (ms): 13587.0 | learning rate: 4.868E-06 | global batch size: 16 | lm loss: 7.363769E+00 | loss scale: 16384.0 | grad norm: 107388.359 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1098/ 159576 | consumed samples: 17568 | elapsed time per iteration (ms): 13667.8 | learning rate: 4.873E-06 | global batch size: 16 | lm loss: 7.310038E+00 | loss scale: 16384.0 | grad norm: 72408.477 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1099/ 159576 | consumed samples: 17584 | elapsed time per iteration (ms): 13707.4 | learning rate: 4.877E-06 | global batch size: 16 | lm loss: 7.291698E+00 | loss scale: 16384.0 | grad norm: 69292.261 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1100/ 159576 | consumed samples: 17600 | elapsed time per iteration (ms): 13564.5 | learning rate: 4.882E-06 | global batch size: 16 | lm loss: 7.713614E+00 | loss scale: 16384.0 | grad norm: 87150.289 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1101/ 159576 | consumed samples: 17616 | elapsed time per iteration (ms): 13621.9 | learning rate: 4.886E-06 | global batch size: 16 | lm loss: 7.482057E+00 | loss scale: 16384.0 | grad norm: 61713.123 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1102/ 159576 | consumed samples: 17632 | elapsed time per iteration (ms): 13628.2 | learning rate: 4.891E-06 | global batch size: 16 | lm loss: 7.370234E+00 | loss scale: 16384.0 | grad norm: 83708.630 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1103/ 159576 | consumed samples: 17648 | elapsed time per iteration (ms): 13962.7 | learning rate: 4.895E-06 | global batch size: 16 | lm loss: 7.373138E+00 | loss scale: 16384.0 | grad norm: 75905.969 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1104/ 159576 | consumed samples: 17664 | elapsed time per iteration (ms): 13627.3 | learning rate: 4.899E-06 | global batch size: 16 | lm loss: 7.448909E+00 | loss scale: 16384.0 | grad norm: 135141.473 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1105/ 159576 | consumed samples: 17680 | elapsed time per iteration (ms): 13640.6 | learning rate: 4.904E-06 | global batch size: 16 | lm loss: 7.252520E+00 | loss scale: 16384.0 | grad norm: 73661.038 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1106/ 159576 | consumed samples: 17696 | elapsed time per iteration (ms): 13666.3 | learning rate: 4.908E-06 | global batch size: 16 | lm loss: 7.507257E+00 | loss scale: 16384.0 | grad norm: 108098.635 | num zeros: 0.0 | number of skipped iterations: 0 | 
number of nan iterations: 0 | -time (ms) - iteration 1107/ 159576 | consumed samples: 17712 | elapsed time per iteration (ms): 13849.3 | learning rate: 4.913E-06 | global batch size: 16 | lm loss: 7.429738E+00 | loss scale: 16384.0 | grad norm: 99851.193 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1108/ 159576 | consumed samples: 17728 | elapsed time per iteration (ms): 13862.9 | learning rate: 4.917E-06 | global batch size: 16 | lm loss: 7.422798E+00 | loss scale: 16384.0 | grad norm: 90788.540 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1109/ 159576 | consumed samples: 17744 | elapsed time per iteration (ms): 13640.2 | learning rate: 4.922E-06 | global batch size: 16 | lm loss: 7.656183E+00 | loss scale: 16384.0 | grad norm: 204462.632 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1110/ 159576 | consumed samples: 17760 | elapsed time per iteration (ms): 13627.0 | learning rate: 4.926E-06 | global batch size: 16 | lm loss: 7.576304E+00 | loss scale: 16384.0 | grad norm: 166002.012 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1111/ 159576 | consumed samples: 17776 | elapsed time per iteration (ms): 13632.9 | learning rate: 4.930E-06 | global batch size: 16 | lm loss: 7.626440E+00 | loss scale: 16384.0 | grad norm: 82466.643 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1112/ 159576 | consumed samples: 17792 | elapsed time per iteration (ms): 13939.0 | learning rate: 4.935E-06 | global batch size: 16 | lm loss: 7.302793E+00 | loss scale: 16384.0 | grad norm: 150100.520 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1113/ 159576 | consumed samples: 17808 | elapsed time per iteration (ms): 13640.4 | learning rate: 4.939E-06 | global batch size: 16 | lm loss: 7.493092E+00 | loss scale: 16384.0 | grad norm: 104956.045 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1114/ 159576 | consumed samples: 17824 | elapsed time per iteration (ms): 13637.6 | learning rate: 4.944E-06 | global batch size: 16 | lm loss: 7.475542E+00 | loss scale: 16384.0 | grad norm: 86316.213 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1115/ 159576 | consumed samples: 17840 | elapsed time per iteration (ms): 13630.5 | learning rate: 4.948E-06 | global batch size: 16 | lm loss: 7.367518E+00 | loss scale: 16384.0 | grad norm: 127229.616 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1116/ 159576 | consumed samples: 17856 | elapsed time per iteration (ms): 13929.1 | learning rate: 4.953E-06 | global batch size: 16 | lm loss: 7.463512E+00 | loss scale: 16384.0 | grad norm: 80765.100 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1117/ 159576 | consumed samples: 17872 | elapsed time per iteration (ms): 13651.9 | learning rate: 4.957E-06 | global batch size: 16 | lm loss: 7.389682E+00 | loss scale: 16384.0 | grad norm: 114274.057 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1118/ 159576 | consumed samples: 17888 | elapsed time per iteration (ms): 13673.8 | learning rate: 
4.962E-06 | global batch size: 16 | lm loss: 7.446970E+00 | loss scale: 16384.0 | grad norm: 93011.728 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1119/ 159576 | consumed samples: 17904 | elapsed time per iteration (ms): 13700.2 | learning rate: 4.966E-06 | global batch size: 16 | lm loss: 7.314221E+00 | loss scale: 16384.0 | grad norm: 105575.833 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1120/ 159576 | consumed samples: 17920 | elapsed time per iteration (ms): 13702.7 | learning rate: 4.970E-06 | global batch size: 16 | lm loss: 7.372279E+00 | loss scale: 16384.0 | grad norm: 77507.701 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1121/ 159576 | consumed samples: 17936 | elapsed time per iteration (ms): 13869.6 | learning rate: 4.975E-06 | global batch size: 16 | lm loss: 7.535093E+00 | loss scale: 16384.0 | grad norm: 98620.342 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1122/ 159576 | consumed samples: 17952 | elapsed time per iteration (ms): 13679.6 | learning rate: 4.979E-06 | global batch size: 16 | lm loss: 8.079200E+00 | loss scale: 16384.0 | grad norm: 187332.489 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1123/ 159576 | consumed samples: 17968 | elapsed time per iteration (ms): 13672.8 | learning rate: 4.984E-06 | global batch size: 16 | lm loss: 7.433456E+00 | loss scale: 16384.0 | grad norm: 139834.433 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1124/ 159576 | consumed samples: 17984 | elapsed time per iteration (ms): 13651.7 | learning rate: 4.988E-06 | global batch size: 16 | lm loss: 7.440439E+00 | loss scale: 16384.0 | grad norm: 91486.607 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1125/ 159576 | consumed samples: 18000 | elapsed time per iteration (ms): 14085.1 | learning rate: 4.993E-06 | global batch size: 16 | lm loss: 7.453449E+00 | loss scale: 16384.0 | grad norm: 170685.218 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1126/ 159576 | consumed samples: 18016 | elapsed time per iteration (ms): 13744.0 | learning rate: 4.997E-06 | global batch size: 16 | lm loss: 7.544756E+00 | loss scale: 16384.0 | grad norm: 93482.948 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1127/ 159576 | consumed samples: 18032 | elapsed time per iteration (ms): 13666.9 | learning rate: 5.001E-06 | global batch size: 16 | lm loss: 7.435877E+00 | loss scale: 16384.0 | grad norm: 98259.154 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1128/ 159576 | consumed samples: 18048 | elapsed time per iteration (ms): 13692.7 | learning rate: 5.006E-06 | global batch size: 16 | lm loss: 7.496342E+00 | loss scale: 16384.0 | grad norm: 130279.795 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1129/ 159576 | consumed samples: 18064 | elapsed time per iteration (ms): 14100.4 | learning rate: 5.010E-06 | global batch size: 16 | lm loss: 7.501980E+00 | loss scale: 16384.0 | grad norm: 88561.836 | num zeros: 0.0 | number of skipped iterations: 0 | 
number of nan iterations: 0 | -time (ms) - iteration 1130/ 159576 | consumed samples: 18080 | elapsed time per iteration (ms): 13620.7 | learning rate: 5.015E-06 | global batch size: 16 | lm loss: 7.470133E+00 | loss scale: 16384.0 | grad norm: 155289.997 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1131/ 159576 | consumed samples: 18096 | elapsed time per iteration (ms): 13683.0 | learning rate: 5.019E-06 | global batch size: 16 | lm loss: 7.539918E+00 | loss scale: 16384.0 | grad norm: 89135.032 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1132/ 159576 | consumed samples: 18112 | elapsed time per iteration (ms): 13643.2 | learning rate: 5.024E-06 | global batch size: 16 | lm loss: 7.537309E+00 | loss scale: 16384.0 | grad norm: 83460.414 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1133/ 159576 | consumed samples: 18128 | elapsed time per iteration (ms): 13758.8 | learning rate: 5.028E-06 | global batch size: 16 | lm loss: 7.445082E+00 | loss scale: 16384.0 | grad norm: 97599.513 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1134/ 159576 | consumed samples: 18144 | elapsed time per iteration (ms): 13842.3 | learning rate: 5.033E-06 | global batch size: 16 | lm loss: 7.533705E+00 | loss scale: 16384.0 | grad norm: 153106.257 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1135/ 159576 | consumed samples: 18160 | elapsed time per iteration (ms): 13641.3 | learning rate: 5.037E-06 | global batch size: 16 | lm loss: 7.351761E+00 | loss scale: 16384.0 | grad norm: 139552.025 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1136/ 159576 | consumed samples: 18176 | elapsed time per iteration (ms): 13757.6 | learning rate: 5.041E-06 | global batch size: 16 | lm loss: 7.386802E+00 | loss scale: 16384.0 | grad norm: 82271.014 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1137/ 159576 | consumed samples: 18192 | elapsed time per iteration (ms): 13590.7 | learning rate: 5.046E-06 | global batch size: 16 | lm loss: 7.276345E+00 | loss scale: 16384.0 | grad norm: 139306.896 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1138/ 159576 | consumed samples: 18208 | elapsed time per iteration (ms): 14099.6 | learning rate: 5.050E-06 | global batch size: 16 | lm loss: 7.489694E+00 | loss scale: 16384.0 | grad norm: 75568.533 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1139/ 159576 | consumed samples: 18224 | elapsed time per iteration (ms): 13765.0 | learning rate: 5.055E-06 | global batch size: 16 | lm loss: 6.968816E+00 | loss scale: 16384.0 | grad norm: 118020.093 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1140/ 159576 | consumed samples: 18240 | elapsed time per iteration (ms): 13662.4 | learning rate: 5.059E-06 | global batch size: 16 | lm loss: 7.446542E+00 | loss scale: 16384.0 | grad norm: 117497.431 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1141/ 159576 | consumed samples: 18256 | elapsed time per iteration (ms): 13747.0 | learning rate: 
5.064E-06 | global batch size: 16 | lm loss: 7.328124E+00 | loss scale: 16384.0 | grad norm: 126653.284 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1142/ 159576 | consumed samples: 18272 | elapsed time per iteration (ms): 14086.2 | learning rate: 5.068E-06 | global batch size: 16 | lm loss: 7.359120E+00 | loss scale: 16384.0 | grad norm: 158587.176 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1143/ 159576 | consumed samples: 18288 | elapsed time per iteration (ms): 13785.6 | learning rate: 5.072E-06 | global batch size: 16 | lm loss: 7.289187E+00 | loss scale: 16384.0 | grad norm: 93193.500 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1144/ 159576 | consumed samples: 18304 | elapsed time per iteration (ms): 13650.1 | learning rate: 5.077E-06 | global batch size: 16 | lm loss: 7.541381E+00 | loss scale: 16384.0 | grad norm: 127276.458 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1145/ 159576 | consumed samples: 18320 | elapsed time per iteration (ms): 13673.3 | learning rate: 5.081E-06 | global batch size: 16 | lm loss: 7.343310E+00 | loss scale: 16384.0 | grad norm: 141086.682 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1146/ 159576 | consumed samples: 18336 | elapsed time per iteration (ms): 13709.3 | learning rate: 5.086E-06 | global batch size: 16 | lm loss: 7.291780E+00 | loss scale: 16384.0 | grad norm: 84706.443 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1147/ 159576 | consumed samples: 18352 | elapsed time per iteration (ms): 13798.7 | learning rate: 5.090E-06 | global batch size: 16 | lm loss: 7.395382E+00 | loss scale: 16384.0 | grad norm: 168181.547 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1148/ 159576 | consumed samples: 18368 | elapsed time per iteration (ms): 13678.3 | learning rate: 5.095E-06 | global batch size: 16 | lm loss: 7.287755E+00 | loss scale: 16384.0 | grad norm: 150595.173 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1149/ 159576 | consumed samples: 18384 | elapsed time per iteration (ms): 13705.6 | learning rate: 5.099E-06 | global batch size: 16 | lm loss: 7.521116E+00 | loss scale: 16384.0 | grad norm: 90594.393 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1150/ 159576 | consumed samples: 18400 | elapsed time per iteration (ms): 13724.2 | learning rate: 5.104E-06 | global batch size: 16 | lm loss: 7.560548E+00 | loss scale: 16384.0 | grad norm: 124093.174 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1151/ 159576 | consumed samples: 18416 | elapsed time per iteration (ms): 14011.4 | learning rate: 5.108E-06 | global batch size: 16 | lm loss: 7.334007E+00 | loss scale: 16384.0 | grad norm: 93590.799 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1152/ 159576 | consumed samples: 18432 | elapsed time per iteration (ms): 13638.1 | learning rate: 5.112E-06 | global batch size: 16 | lm loss: 7.340695E+00 | loss scale: 16384.0 | grad norm: 120515.541 | num zeros: 0.0 | number of skipped iterations: 0 | 
number of nan iterations: 0 | -time (ms) - iteration 1153/ 159576 | consumed samples: 18448 | elapsed time per iteration (ms): 13670.9 | learning rate: 5.117E-06 | global batch size: 16 | lm loss: 7.310359E+00 | loss scale: 16384.0 | grad norm: 121580.561 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1154/ 159576 | consumed samples: 18464 | elapsed time per iteration (ms): 13692.4 | learning rate: 5.121E-06 | global batch size: 16 | lm loss: 7.407881E+00 | loss scale: 16384.0 | grad norm: 86210.472 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1155/ 159576 | consumed samples: 18480 | elapsed time per iteration (ms): 14124.7 | learning rate: 5.126E-06 | global batch size: 16 | lm loss: 7.533539E+00 | loss scale: 16384.0 | grad norm: 117499.375 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1156/ 159576 | consumed samples: 18496 | elapsed time per iteration (ms): 13713.9 | learning rate: 5.130E-06 | global batch size: 16 | lm loss: 7.454373E+00 | loss scale: 16384.0 | grad norm: 82164.881 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1157/ 159576 | consumed samples: 18512 | elapsed time per iteration (ms): 13665.0 | learning rate: 5.135E-06 | global batch size: 16 | lm loss: 6.997806E+00 | loss scale: 16384.0 | grad norm: 118291.842 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1158/ 159576 | consumed samples: 18528 | elapsed time per iteration (ms): 13620.7 | learning rate: 5.139E-06 | global batch size: 16 | lm loss: 7.155181E+00 | loss scale: 16384.0 | grad norm: 80841.378 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1159/ 159576 | consumed samples: 18544 | elapsed time per iteration (ms): 13522.0 | learning rate: 5.143E-06 | global batch size: 16 | lm loss: 7.303053E+00 | loss scale: 16384.0 | grad norm: 153692.954 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1160/ 159576 | consumed samples: 18560 | elapsed time per iteration (ms): 13934.6 | learning rate: 5.148E-06 | global batch size: 16 | lm loss: 7.453541E+00 | loss scale: 16384.0 | grad norm: 178564.006 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1161/ 159576 | consumed samples: 18576 | elapsed time per iteration (ms): 13591.1 | learning rate: 5.152E-06 | global batch size: 16 | lm loss: 7.370741E+00 | loss scale: 16384.0 | grad norm: 96828.834 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1162/ 159576 | consumed samples: 18592 | elapsed time per iteration (ms): 13610.9 | learning rate: 5.157E-06 | global batch size: 16 | lm loss: 7.395625E+00 | loss scale: 16384.0 | grad norm: 138531.373 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1163/ 159576 | consumed samples: 18608 | elapsed time per iteration (ms): 13633.4 | learning rate: 5.161E-06 | global batch size: 16 | lm loss: 7.721334E+00 | loss scale: 16384.0 | grad norm: 107198.076 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1164/ 159576 | consumed samples: 18624 | elapsed time per iteration (ms): 13919.7 | learning rate: 
5.166E-06 | global batch size: 16 | lm loss: 7.418262E+00 | loss scale: 16384.0 | grad norm: 104593.384 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1165/ 159576 | consumed samples: 18640 | elapsed time per iteration (ms): 13699.8 | learning rate: 5.170E-06 | global batch size: 16 | lm loss: 7.388452E+00 | loss scale: 16384.0 | grad norm: 87922.625 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1166/ 159576 | consumed samples: 18656 | elapsed time per iteration (ms): 13567.0 | learning rate: 5.175E-06 | global batch size: 16 | lm loss: 7.359789E+00 | loss scale: 16384.0 | grad norm: 167490.320 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1167/ 159576 | consumed samples: 18672 | elapsed time per iteration (ms): 13665.3 | learning rate: 5.179E-06 | global batch size: 16 | lm loss: 7.513920E+00 | loss scale: 16384.0 | grad norm: 187148.881 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1168/ 159576 | consumed samples: 18688 | elapsed time per iteration (ms): 13712.9 | learning rate: 5.183E-06 | global batch size: 16 | lm loss: 7.333634E+00 | loss scale: 16384.0 | grad norm: 80524.927 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1169/ 159576 | consumed samples: 18704 | elapsed time per iteration (ms): 13807.4 | learning rate: 5.188E-06 | global batch size: 16 | lm loss: 7.551642E+00 | loss scale: 16384.0 | grad norm: 96715.430 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1170/ 159576 | consumed samples: 18720 | elapsed time per iteration (ms): 13672.0 | learning rate: 5.192E-06 | global batch size: 16 | lm loss: 7.354926E+00 | loss scale: 16384.0 | grad norm: 108931.618 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1171/ 159576 | consumed samples: 18736 | elapsed time per iteration (ms): 13735.2 | learning rate: 5.197E-06 | global batch size: 16 | lm loss: 7.360828E+00 | loss scale: 16384.0 | grad norm: 93043.561 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1172/ 159576 | consumed samples: 18752 | elapsed time per iteration (ms): 13717.8 | learning rate: 5.201E-06 | global batch size: 16 | lm loss: 7.538117E+00 | loss scale: 16384.0 | grad norm: 318365.891 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1173/ 159576 | consumed samples: 18768 | elapsed time per iteration (ms): 13883.3 | learning rate: 5.206E-06 | global batch size: 16 | lm loss: 7.601986E+00 | loss scale: 16384.0 | grad norm: 139775.022 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1174/ 159576 | consumed samples: 18784 | elapsed time per iteration (ms): 13707.5 | learning rate: 5.210E-06 | global batch size: 16 | lm loss: 7.492588E+00 | loss scale: 16384.0 | grad norm: 90689.919 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1175/ 159576 | consumed samples: 18800 | elapsed time per iteration (ms): 13678.7 | learning rate: 5.214E-06 | global batch size: 16 | lm loss: 7.586353E+00 | loss scale: 16384.0 | grad norm: 123587.039 | num zeros: 0.0 | number of skipped iterations: 0 | 
number of nan iterations: 0 | -time (ms) - iteration 1176/ 159576 | consumed samples: 18816 | elapsed time per iteration (ms): 13643.8 | learning rate: 5.219E-06 | global batch size: 16 | lm loss: 7.585982E+00 | loss scale: 16384.0 | grad norm: 134121.461 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1177/ 159576 | consumed samples: 18832 | elapsed time per iteration (ms): 13876.6 | learning rate: 5.223E-06 | global batch size: 16 | lm loss: 7.290177E+00 | loss scale: 16384.0 | grad norm: 61795.500 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1178/ 159576 | consumed samples: 18848 | elapsed time per iteration (ms): 13887.6 | learning rate: 5.228E-06 | global batch size: 16 | lm loss: 7.394442E+00 | loss scale: 16384.0 | grad norm: 214580.050 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1179/ 159576 | consumed samples: 18864 | elapsed time per iteration (ms): 13671.2 | learning rate: 5.232E-06 | global batch size: 16 | lm loss: 7.342830E+00 | loss scale: 16384.0 | grad norm: 170377.555 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1180/ 159576 | consumed samples: 18880 | elapsed time per iteration (ms): 13615.6 | learning rate: 5.237E-06 | global batch size: 16 | lm loss: 7.353875E+00 | loss scale: 16384.0 | grad norm: 98364.101 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1181/ 159576 | consumed samples: 18896 | elapsed time per iteration (ms): 13659.2 | learning rate: 5.241E-06 | global batch size: 16 | lm loss: 7.310112E+00 | loss scale: 16384.0 | grad norm: 153347.882 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1182/ 159576 | consumed samples: 18912 | elapsed time per iteration (ms): 13718.2 | learning rate: 5.246E-06 | global batch size: 16 | lm loss: 7.516181E+00 | loss scale: 16384.0 | grad norm: 183425.509 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1183/ 159576 | consumed samples: 18928 | elapsed time per iteration (ms): 13614.7 | learning rate: 5.250E-06 | global batch size: 16 | lm loss: 7.284205E+00 | loss scale: 16384.0 | grad norm: 116539.767 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1184/ 159576 | consumed samples: 18944 | elapsed time per iteration (ms): 13636.1 | learning rate: 5.254E-06 | global batch size: 16 | lm loss: 7.392292E+00 | loss scale: 16384.0 | grad norm: 167498.612 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1185/ 159576 | consumed samples: 18960 | elapsed time per iteration (ms): 13633.9 | learning rate: 5.259E-06 | global batch size: 16 | lm loss: 7.250909E+00 | loss scale: 16384.0 | grad norm: 100955.402 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1186/ 159576 | consumed samples: 18976 | elapsed time per iteration (ms): 13999.4 | learning rate: 5.263E-06 | global batch size: 16 | lm loss: 7.536862E+00 | loss scale: 16384.0 | grad norm: 100050.160 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1187/ 159576 | consumed samples: 18992 | elapsed time per iteration (ms): 13653.6 | learning rate: 
5.268E-06 | global batch size: 16 | lm loss: 7.565104E+00 | loss scale: 16384.0 | grad norm: 118619.018 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1188/ 159576 | consumed samples: 19008 | elapsed time per iteration (ms): 13606.5 | learning rate: 5.272E-06 | global batch size: 16 | lm loss: 7.258739E+00 | loss scale: 16384.0 | grad norm: 126790.154 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1189/ 159576 | consumed samples: 19024 | elapsed time per iteration (ms): 13571.9 | learning rate: 5.277E-06 | global batch size: 16 | lm loss: 7.184493E+00 | loss scale: 16384.0 | grad norm: 84818.036 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1190/ 159576 | consumed samples: 19040 | elapsed time per iteration (ms): 13962.3 | learning rate: 5.281E-06 | global batch size: 16 | lm loss: 7.209998E+00 | loss scale: 16384.0 | grad norm: 131280.260 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1191/ 159576 | consumed samples: 19056 | elapsed time per iteration (ms): 13770.8 | learning rate: 5.286E-06 | global batch size: 16 | lm loss: 7.406217E+00 | loss scale: 16384.0 | grad norm: 110178.484 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1192/ 159576 | consumed samples: 19072 | elapsed time per iteration (ms): 13665.3 | learning rate: 5.290E-06 | global batch size: 16 | lm loss: 7.350411E+00 | loss scale: 16384.0 | grad norm: 81228.032 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1193/ 159576 | consumed samples: 19088 | elapsed time per iteration (ms): 13585.9 | learning rate: 5.294E-06 | global batch size: 16 | lm loss: 7.583058E+00 | loss scale: 16384.0 | grad norm: 291080.363 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1194/ 159576 | consumed samples: 19104 | elapsed time per iteration (ms): 13658.0 | learning rate: 5.299E-06 | global batch size: 16 | lm loss: 7.808938E+00 | loss scale: 16384.0 | grad norm: 193632.364 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1195/ 159576 | consumed samples: 19120 | elapsed time per iteration (ms): 13777.0 | learning rate: 5.303E-06 | global batch size: 16 | lm loss: 7.459247E+00 | loss scale: 16384.0 | grad norm: 100738.405 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1196/ 159576 | consumed samples: 19136 | elapsed time per iteration (ms): 13624.3 | learning rate: 5.308E-06 | global batch size: 16 | lm loss: 7.240894E+00 | loss scale: 16384.0 | grad norm: 102223.561 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1197/ 159576 | consumed samples: 19152 | elapsed time per iteration (ms): 13630.2 | learning rate: 5.312E-06 | global batch size: 16 | lm loss: 7.469604E+00 | loss scale: 16384.0 | grad norm: 91547.502 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1198/ 159576 | consumed samples: 19168 | elapsed time per iteration (ms): 13603.4 | learning rate: 5.317E-06 | global batch size: 16 | lm loss: 7.399169E+00 | loss scale: 16384.0 | grad norm: 246196.581 | num zeros: 0.0 | number of skipped iterations: 0 | 
number of nan iterations: 0 | -time (ms) - iteration 1199/ 159576 | consumed samples: 19184 | elapsed time per iteration (ms): 14028.5 | learning rate: 5.321E-06 | global batch size: 16 | lm loss: 7.465099E+00 | loss scale: 16384.0 | grad norm: 185665.583 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1200/ 159576 | consumed samples: 19200 | elapsed time per iteration (ms): 13601.1 | learning rate: 5.325E-06 | global batch size: 16 | lm loss: 7.383169E+00 | loss scale: 16384.0 | grad norm: 115872.720 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1201/ 159576 | consumed samples: 19216 | elapsed time per iteration (ms): 13566.6 | learning rate: 5.330E-06 | global batch size: 16 | lm loss: 7.352910E+00 | loss scale: 16384.0 | grad norm: 114834.353 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1202/ 159576 | consumed samples: 19232 | elapsed time per iteration (ms): 13557.4 | learning rate: 5.334E-06 | global batch size: 16 | lm loss: 7.521720E+00 | loss scale: 16384.0 | grad norm: 101976.012 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1203/ 159576 | consumed samples: 19248 | elapsed time per iteration (ms): 13525.0 | learning rate: 5.339E-06 | global batch size: 16 | lm loss: 7.225696E+00 | loss scale: 16384.0 | grad norm: 178745.243 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1204/ 159576 | consumed samples: 19264 | elapsed time per iteration (ms): 13539.3 | learning rate: 5.343E-06 | global batch size: 16 | lm loss: 7.375963E+00 | loss scale: 16384.0 | grad norm: 175723.616 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1205/ 159576 | consumed samples: 19280 | elapsed time per iteration (ms): 13532.3 | learning rate: 5.348E-06 | global batch size: 16 | lm loss: 7.402988E+00 | loss scale: 16384.0 | grad norm: 104645.448 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1206/ 159576 | consumed samples: 19296 | elapsed time per iteration (ms): 13502.9 | learning rate: 5.352E-06 | global batch size: 16 | lm loss: 7.302839E+00 | loss scale: 16384.0 | grad norm: 99328.230 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1207/ 159576 | consumed samples: 19312 | elapsed time per iteration (ms): 13540.4 | learning rate: 5.357E-06 | global batch size: 16 | lm loss: 7.555269E+00 | loss scale: 16384.0 | grad norm: 89166.858 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1208/ 159576 | consumed samples: 19328 | elapsed time per iteration (ms): 13900.0 | learning rate: 5.361E-06 | global batch size: 16 | lm loss: 7.459805E+00 | loss scale: 16384.0 | grad norm: 135152.393 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1209/ 159576 | consumed samples: 19344 | elapsed time per iteration (ms): 13560.6 | learning rate: 5.365E-06 | global batch size: 16 | lm loss: 7.419579E+00 | loss scale: 16384.0 | grad norm: 101249.512 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1210/ 159576 | consumed samples: 19360 | elapsed time per iteration (ms): 13658.8 | learning rate: 
5.370E-06 | global batch size: 16 | lm loss: 7.348646E+00 | loss scale: 16384.0 | grad norm: 104483.609 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1211/ 159576 | consumed samples: 19376 | elapsed time per iteration (ms): 13533.6 | learning rate: 5.374E-06 | global batch size: 16 | lm loss: 7.494230E+00 | loss scale: 16384.0 | grad norm: 110210.437 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1212/ 159576 | consumed samples: 19392 | elapsed time per iteration (ms): 13905.0 | learning rate: 5.379E-06 | global batch size: 16 | lm loss: 7.390188E+00 | loss scale: 16384.0 | grad norm: 96645.582 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1213/ 159576 | consumed samples: 19408 | elapsed time per iteration (ms): 13673.2 | learning rate: 5.383E-06 | global batch size: 16 | lm loss: 7.318599E+00 | loss scale: 16384.0 | grad norm: 166216.352 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1214/ 159576 | consumed samples: 19424 | elapsed time per iteration (ms): 13582.9 | learning rate: 5.388E-06 | global batch size: 16 | lm loss: 7.262068E+00 | loss scale: 16384.0 | grad norm: 75724.522 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1215/ 159576 | consumed samples: 19440 | elapsed time per iteration (ms): 13570.1 | learning rate: 5.392E-06 | global batch size: 16 | lm loss: 7.594563E+00 | loss scale: 16384.0 | grad norm: 95306.819 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1216/ 159576 | consumed samples: 19456 | elapsed time per iteration (ms): 13639.7 | learning rate: 5.396E-06 | global batch size: 16 | lm loss: 7.375734E+00 | loss scale: 16384.0 | grad norm: 86152.125 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1217/ 159576 | consumed samples: 19472 | elapsed time per iteration (ms): 14091.6 | learning rate: 5.401E-06 | global batch size: 16 | lm loss: 7.213047E+00 | loss scale: 16384.0 | grad norm: 95583.311 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1218/ 159576 | consumed samples: 19488 | elapsed time per iteration (ms): 13516.3 | learning rate: 5.405E-06 | global batch size: 16 | lm loss: 7.437682E+00 | loss scale: 16384.0 | grad norm: 221549.634 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1219/ 159576 | consumed samples: 19504 | elapsed time per iteration (ms): 13610.0 | learning rate: 5.410E-06 | global batch size: 16 | lm loss: 7.254605E+00 | loss scale: 16384.0 | grad norm: 97554.516 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1220/ 159576 | consumed samples: 19520 | elapsed time per iteration (ms): 13565.5 | learning rate: 5.414E-06 | global batch size: 16 | lm loss: 7.248229E+00 | loss scale: 16384.0 | grad norm: 89138.195 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1221/ 159576 | consumed samples: 19536 | elapsed time per iteration (ms): 13989.3 | learning rate: 5.419E-06 | global batch size: 16 | lm loss: 7.313151E+00 | loss scale: 16384.0 | grad norm: 172651.828 | num zeros: 0.0 | number of skipped iterations: 0 | 
number of nan iterations: 0 | -time (ms) - iteration 1222/ 159576 | consumed samples: 19552 | elapsed time per iteration (ms): 13602.4 | learning rate: 5.423E-06 | global batch size: 16 | lm loss: 7.476789E+00 | loss scale: 16384.0 | grad norm: 67387.822 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1223/ 159576 | consumed samples: 19568 | elapsed time per iteration (ms): 13656.0 | learning rate: 5.428E-06 | global batch size: 16 | lm loss: 7.289939E+00 | loss scale: 16384.0 | grad norm: 207125.248 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1224/ 159576 | consumed samples: 19584 | elapsed time per iteration (ms): 13537.8 | learning rate: 5.432E-06 | global batch size: 16 | lm loss: 7.409894E+00 | loss scale: 16384.0 | grad norm: 156218.537 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1225/ 159576 | consumed samples: 19600 | elapsed time per iteration (ms): 13600.0 | learning rate: 5.436E-06 | global batch size: 16 | lm loss: 7.226832E+00 | loss scale: 16384.0 | grad norm: 93258.536 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1226/ 159576 | consumed samples: 19616 | elapsed time per iteration (ms): 13778.7 | learning rate: 5.441E-06 | global batch size: 16 | lm loss: 7.406470E+00 | loss scale: 16384.0 | grad norm: 95037.623 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1227/ 159576 | consumed samples: 19632 | elapsed time per iteration (ms): 13609.5 | learning rate: 5.445E-06 | global batch size: 16 | lm loss: 7.385060E+00 | loss scale: 16384.0 | grad norm: 77831.367 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1228/ 159576 | consumed samples: 19648 | elapsed time per iteration (ms): 13561.8 | learning rate: 5.450E-06 | global batch size: 16 | lm loss: 7.283795E+00 | loss scale: 16384.0 | grad norm: 219813.514 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1229/ 159576 | consumed samples: 19664 | elapsed time per iteration (ms): 13619.4 | learning rate: 5.454E-06 | global batch size: 16 | lm loss: 7.344219E+00 | loss scale: 16384.0 | grad norm: 122192.335 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1230/ 159576 | consumed samples: 19680 | elapsed time per iteration (ms): 14054.6 | learning rate: 5.459E-06 | global batch size: 16 | lm loss: 7.364305E+00 | loss scale: 16384.0 | grad norm: 90944.731 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1231/ 159576 | consumed samples: 19696 | elapsed time per iteration (ms): 13589.9 | learning rate: 5.463E-06 | global batch size: 16 | lm loss: 7.421730E+00 | loss scale: 16384.0 | grad norm: 178816.259 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1232/ 159576 | consumed samples: 19712 | elapsed time per iteration (ms): 13624.6 | learning rate: 5.467E-06 | global batch size: 16 | lm loss: 7.278720E+00 | loss scale: 16384.0 | grad norm: 101190.498 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1233/ 159576 | consumed samples: 19728 | elapsed time per iteration (ms): 13574.7 | learning rate: 
5.472E-06 | global batch size: 16 | lm loss: 7.525582E+00 | loss scale: 16384.0 | grad norm: 95476.386 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1234/ 159576 | consumed samples: 19744 | elapsed time per iteration (ms): 13981.0 | learning rate: 5.476E-06 | global batch size: 16 | lm loss: 7.294508E+00 | loss scale: 16384.0 | grad norm: 110379.726 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1235/ 159576 | consumed samples: 19760 | elapsed time per iteration (ms): 13641.1 | learning rate: 5.481E-06 | global batch size: 16 | lm loss: 7.431972E+00 | loss scale: 16384.0 | grad norm: 103188.497 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1236/ 159576 | consumed samples: 19776 | elapsed time per iteration (ms): 13575.4 | learning rate: 5.485E-06 | global batch size: 16 | lm loss: 7.397687E+00 | loss scale: 16384.0 | grad norm: 92125.975 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1237/ 159576 | consumed samples: 19792 | elapsed time per iteration (ms): 13672.0 | learning rate: 5.490E-06 | global batch size: 16 | lm loss: 7.314774E+00 | loss scale: 16384.0 | grad norm: 75870.645 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1238/ 159576 | consumed samples: 19808 | elapsed time per iteration (ms): 13509.4 | learning rate: 5.494E-06 | global batch size: 16 | lm loss: 7.187806E+00 | loss scale: 16384.0 | grad norm: 173296.806 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1239/ 159576 | consumed samples: 19824 | elapsed time per iteration (ms): 13875.3 | learning rate: 5.499E-06 | global batch size: 16 | lm loss: 7.376097E+00 | loss scale: 16384.0 | grad norm: 133632.906 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1240/ 159576 | consumed samples: 19840 | elapsed time per iteration (ms): 13610.1 | learning rate: 5.503E-06 | global batch size: 16 | lm loss: 7.267582E+00 | loss scale: 16384.0 | grad norm: 85104.985 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1241/ 159576 | consumed samples: 19856 | elapsed time per iteration (ms): 13551.5 | learning rate: 5.507E-06 | global batch size: 16 | lm loss: 7.352735E+00 | loss scale: 16384.0 | grad norm: 90699.366 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1242/ 159576 | consumed samples: 19872 | elapsed time per iteration (ms): 13593.9 | learning rate: 5.512E-06 | global batch size: 16 | lm loss: 7.468503E+00 | loss scale: 16384.0 | grad norm: 83188.176 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1243/ 159576 | consumed samples: 19888 | elapsed time per iteration (ms): 13930.9 | learning rate: 5.516E-06 | global batch size: 16 | lm loss: 7.214951E+00 | loss scale: 16384.0 | grad norm: 78366.480 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1244/ 159576 | consumed samples: 19904 | elapsed time per iteration (ms): 13652.1 | learning rate: 5.521E-06 | global batch size: 16 | lm loss: 7.260246E+00 | loss scale: 16384.0 | grad norm: 80928.941 | num zeros: 0.0 | number of skipped iterations: 0 | number 
of nan iterations: 0 |
-time (ms)
-[2021-09-24 07:03:47] PULSE: tr8-104B is waiting for the previous Job Array job to finish before scheduling a new one (1162855_[2-10%1] on 'gpu_p13' partition)
-[2021-09-24 07:03:47] PULSE: tr8-104B is running for 1:11:36 since 2021-09-24T05:52:11 (1162855_1 on 'gpu_p13' partition (r6i4n[5,7],r6i5n[2,7-8],r6i6n[0,2,6],r7i2n[4-5],r7i6n[2-4],r7i7n[7-8],r8i0n[2-3,5-8],r8i1n[0,2-4],r8i2n8,r8i3n[0-2],r8i5n[3-4],r8i7n[3-8],r9i0n[0-2],r9i1n[0-3],r9i2n[3-5,8],r9i3n[0-1,7-8],r9i4n[0-2],r9i5n[3-8],r9i6n[0,7-8])
- iteration 1245/ 159576 | consumed samples: 19920 | elapsed time per iteration (ms): 13521.2 | learning rate: 5.525E-06 | global batch size: 16 | lm loss: 7.539850E+00 | loss scale: 16384.0 | grad norm: 85379.198 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1246/ 159576 | consumed samples: 19936 | elapsed time per iteration (ms): 13540.5 | learning rate: 5.530E-06 | global batch size: 16 | lm loss: 7.541747E+00 | loss scale: 16384.0 | grad norm: 112594.519 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1247/ 159576 | consumed samples: 19952 | elapsed time per iteration (ms): 13599.8 | learning rate: 5.534E-06 | global batch size: 16 | lm loss: 7.427727E+00 | loss scale: 16384.0 | grad norm: 75830.490 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1248/ 159576 | consumed samples: 19968 | elapsed time per iteration (ms): 13827.8 | learning rate: 5.538E-06 | global batch size: 16 | lm loss: 7.407825E+00 | loss scale: 16384.0 | grad norm: 125194.168 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1249/ 159576 | consumed samples: 19984 | elapsed time per iteration (ms): 13505.2 | learning rate: 5.543E-06 | global batch size: 16 | lm loss: 7.566711E+00 | loss scale: 16384.0 | grad norm: 116825.251 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1250/ 159576 | consumed samples: 20000 | elapsed time per iteration (ms): 13584.6 | learning rate: 5.547E-06 | global batch size: 16 | lm loss: 7.156756E+00 | loss scale: 16384.0 | grad norm: 75875.506 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1251/ 159576 | consumed samples: 20016 | elapsed time per iteration (ms): 13599.4 | learning rate: 5.552E-06 | global batch size: 16 | lm loss: 7.355666E+00 | loss scale: 16384.0 | grad norm: 128516.877 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1252/ 159576 | consumed samples: 20032 | elapsed time per iteration (ms): 13882.6 | learning rate: 5.556E-06 | global batch size: 16 | lm loss: 7.339529E+00 | loss scale: 16384.0 | grad norm: 92000.517 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1253/ 159576 | consumed samples: 20048 | elapsed time per iteration (ms): 13669.5 | learning rate: 5.561E-06 | global batch size: 16 | lm loss: 7.246970E+00 | loss scale: 16384.0 | grad norm: 68938.329 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1254/ 159576 | consumed samples: 20064 | elapsed time per iteration (ms): 13534.9 | learning rate: 5.565E-06 | global batch size: 16 | lm loss: 7.505607E+00 | loss scale: 16384.0 | grad norm: 103078.555 | num zeros: 0.0 |
number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1255/ 159576 | consumed samples: 20080 | elapsed time per iteration (ms): 13594.8 | learning rate: 5.570E-06 | global batch size: 16 | lm loss: 7.386476E+00 | loss scale: 16384.0 | grad norm: 104529.887 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1256/ 159576 | consumed samples: 20096 | elapsed time per iteration (ms): 13795.8 | learning rate: 5.574E-06 | global batch size: 16 | lm loss: 7.263406E+00 | loss scale: 16384.0 | grad norm: 82840.246 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1257/ 159576 | consumed samples: 20112 | elapsed time per iteration (ms): 13529.7 | learning rate: 5.578E-06 | global batch size: 16 | lm loss: 7.356731E+00 | loss scale: 16384.0 | grad norm: 64612.754 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1258/ 159576 | consumed samples: 20128 | elapsed time per iteration (ms): 13538.7 | learning rate: 5.583E-06 | global batch size: 16 | lm loss: 7.516888E+00 | loss scale: 16384.0 | grad norm: 136048.030 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1259/ 159576 | consumed samples: 20144 | elapsed time per iteration (ms): 13556.0 | learning rate: 5.587E-06 | global batch size: 16 | lm loss: 7.352553E+00 | loss scale: 16384.0 | grad norm: 81380.126 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1260/ 159576 | consumed samples: 20160 | elapsed time per iteration (ms): 13488.1 | learning rate: 5.592E-06 | global batch size: 16 | lm loss: 7.385587E+00 | loss scale: 16384.0 | grad norm: 121637.321 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1261/ 159576 | consumed samples: 20176 | elapsed time per iteration (ms): 13803.4 | learning rate: 5.596E-06 | global batch size: 16 | lm loss: 7.280743E+00 | loss scale: 16384.0 | grad norm: 89726.532 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1262/ 159576 | consumed samples: 20192 | elapsed time per iteration (ms): 13426.2 | learning rate: 5.601E-06 | global batch size: 16 | lm loss: 7.512013E+00 | loss scale: 16384.0 | grad norm: 85518.754 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1263/ 159576 | consumed samples: 20208 | elapsed time per iteration (ms): 13492.1 | learning rate: 5.605E-06 | global batch size: 16 | lm loss: 7.145048E+00 | loss scale: 16384.0 | grad norm: 112279.366 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1264/ 159576 | consumed samples: 20224 | elapsed time per iteration (ms): 13537.9 | learning rate: 5.609E-06 | global batch size: 16 | lm loss: 7.608912E+00 | loss scale: 16384.0 | grad norm: 96612.876 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1265/ 159576 | consumed samples: 20240 | elapsed time per iteration (ms): 13857.6 | learning rate: 5.614E-06 | global batch size: 16 | lm loss: 7.316525E+00 | loss scale: 16384.0 | grad norm: 73736.489 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1266/ 159576 | consumed samples: 20256 | elapsed time per iteration (ms): 
13475.3 | learning rate: 5.618E-06 | global batch size: 16 | lm loss: 7.406303E+00 | loss scale: 16384.0 | grad norm: 69485.433 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1267/ 159576 | consumed samples: 20272 | elapsed time per iteration (ms): 13513.4 | learning rate: 5.623E-06 | global batch size: 16 | lm loss: 7.282144E+00 | loss scale: 16384.0 | grad norm: 72619.526 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1268/ 159576 | consumed samples: 20288 | elapsed time per iteration (ms): 13517.8 | learning rate: 5.627E-06 | global batch size: 16 | lm loss: 7.419368E+00 | loss scale: 16384.0 | grad norm: 107085.697 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1269/ 159576 | consumed samples: 20304 | elapsed time per iteration (ms): 13507.2 | learning rate: 5.632E-06 | global batch size: 16 | lm loss: 7.427319E+00 | loss scale: 16384.0 | grad norm: 75455.531 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1270/ 159576 | consumed samples: 20320 | elapsed time per iteration (ms): 13744.8 | learning rate: 5.636E-06 | global batch size: 16 | lm loss: 7.348005E+00 | loss scale: 16384.0 | grad norm: 119801.062 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1271/ 159576 | consumed samples: 20336 | elapsed time per iteration (ms): 13569.3 | learning rate: 5.641E-06 | global batch size: 16 | lm loss: 7.365005E+00 | loss scale: 16384.0 | grad norm: 64957.880 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1272/ 159576 | consumed samples: 20352 | elapsed time per iteration (ms): 13569.6 | learning rate: 5.645E-06 | global batch size: 16 | lm loss: 7.429317E+00 | loss scale: 16384.0 | grad norm: 178872.228 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1273/ 159576 | consumed samples: 20368 | elapsed time per iteration (ms): 13472.8 | learning rate: 5.649E-06 | global batch size: 16 | lm loss: 7.312444E+00 | loss scale: 16384.0 | grad norm: 131489.957 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1274/ 159576 | consumed samples: 20384 | elapsed time per iteration (ms): 14043.7 | learning rate: 5.654E-06 | global batch size: 16 | lm loss: 7.280907E+00 | loss scale: 16384.0 | grad norm: 80742.529 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1275/ 159576 | consumed samples: 20400 | elapsed time per iteration (ms): 13515.6 | learning rate: 5.658E-06 | global batch size: 16 | lm loss: 7.473969E+00 | loss scale: 16384.0 | grad norm: 192617.575 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1276/ 159576 | consumed samples: 20416 | elapsed time per iteration (ms): 13555.1 | learning rate: 5.663E-06 | global batch size: 16 | lm loss: 7.571683E+00 | loss scale: 16384.0 | grad norm: 142231.827 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1277/ 159576 | consumed samples: 20432 | elapsed time per iteration (ms): 13684.0 | learning rate: 5.667E-06 | global batch size: 16 | lm loss: 7.370350E+00 | loss scale: 16384.0 | grad norm: 91290.772 | num zeros: 0.0 | number of 
skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1278/ 159576 | consumed samples: 20448 | elapsed time per iteration (ms): 14108.9 | learning rate: 5.672E-06 | global batch size: 16 | lm loss: 7.258504E+00 | loss scale: 16384.0 | grad norm: 111985.269 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1279/ 159576 | consumed samples: 20464 | elapsed time per iteration (ms): 13599.8 | learning rate: 5.676E-06 | global batch size: 16 | lm loss: 7.378584E+00 | loss scale: 16384.0 | grad norm: 101238.659 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1280/ 159576 | consumed samples: 20480 | elapsed time per iteration (ms): 13689.3 | learning rate: 5.680E-06 | global batch size: 16 | lm loss: 7.344358E+00 | loss scale: 16384.0 | grad norm: 131175.820 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1281/ 159576 | consumed samples: 20496 | elapsed time per iteration (ms): 13675.0 | learning rate: 5.685E-06 | global batch size: 16 | lm loss: 7.253249E+00 | loss scale: 16384.0 | grad norm: 81245.877 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1282/ 159576 | consumed samples: 20512 | elapsed time per iteration (ms): 13723.8 | learning rate: 5.689E-06 | global batch size: 16 | lm loss: 7.385771E+00 | loss scale: 16384.0 | grad norm: 80281.812 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1283/ 159576 | consumed samples: 20528 | elapsed time per iteration (ms): 13839.8 | learning rate: 5.694E-06 | global batch size: 16 | lm loss: 7.253633E+00 | loss scale: 16384.0 | grad norm: 106168.685 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1284/ 159576 | consumed samples: 20544 | elapsed time per iteration (ms): 13645.0 | learning rate: 5.698E-06 | global batch size: 16 | lm loss: 7.091393E+00 | loss scale: 16384.0 | grad norm: 119249.818 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1285/ 159576 | consumed samples: 20560 | elapsed time per iteration (ms): 13673.3 | learning rate: 5.703E-06 | global batch size: 16 | lm loss: 7.346157E+00 | loss scale: 16384.0 | grad norm: 87118.195 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1286/ 159576 | consumed samples: 20576 | elapsed time per iteration (ms): 13680.7 | learning rate: 5.707E-06 | global batch size: 16 | lm loss: 7.301017E+00 | loss scale: 16384.0 | grad norm: 66813.094 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1287/ 159576 | consumed samples: 20592 | elapsed time per iteration (ms): 14107.0 | learning rate: 5.712E-06 | global batch size: 16 | lm loss: 7.228415E+00 | loss scale: 16384.0 | grad norm: 90274.274 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1288/ 159576 | consumed samples: 20608 | elapsed time per iteration (ms): 13593.6 | learning rate: 5.716E-06 | global batch size: 16 | lm loss: 7.412420E+00 | loss scale: 16384.0 | grad norm: 74854.970 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1289/ 159576 | consumed samples: 20624 | elapsed time per iteration (ms): 13657.4 | 
learning rate: 5.720E-06 | global batch size: 16 | lm loss: 7.296477E+00 | loss scale: 16384.0 | grad norm: 78756.807 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1290/ 159576 | consumed samples: 20640 | elapsed time per iteration (ms): 13628.7 | learning rate: 5.725E-06 | global batch size: 16 | lm loss: 7.091270E+00 | loss scale: 16384.0 | grad norm: 77550.258 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1291/ 159576 | consumed samples: 20656 | elapsed time per iteration (ms): 13654.9 | learning rate: 5.729E-06 | global batch size: 16 | lm loss: 7.247941E+00 | loss scale: 16384.0 | grad norm: 140565.268 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1292/ 159576 | consumed samples: 20672 | elapsed time per iteration (ms): 13789.5 | learning rate: 5.734E-06 | global batch size: 16 | lm loss: 7.326149E+00 | loss scale: 16384.0 | grad norm: 66170.421 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1293/ 159576 | consumed samples: 20688 | elapsed time per iteration (ms): 13629.3 | learning rate: 5.738E-06 | global batch size: 16 | lm loss: 7.358797E+00 | loss scale: 16384.0 | grad norm: 94692.189 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1294/ 159576 | consumed samples: 20704 | elapsed time per iteration (ms): 13584.0 | learning rate: 5.743E-06 | global batch size: 16 | lm loss: 7.254357E+00 | loss scale: 16384.0 | grad norm: 69169.193 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1295/ 159576 | consumed samples: 20720 | elapsed time per iteration (ms): 13612.6 | learning rate: 5.747E-06 | global batch size: 16 | lm loss: 7.449785E+00 | loss scale: 16384.0 | grad norm: 180039.609 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1296/ 159576 | consumed samples: 20736 | elapsed time per iteration (ms): 13948.4 | learning rate: 5.751E-06 | global batch size: 16 | lm loss: 7.506041E+00 | loss scale: 16384.0 | grad norm: 147606.074 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1297/ 159576 | consumed samples: 20752 | elapsed time per iteration (ms): 13604.2 | learning rate: 5.756E-06 | global batch size: 16 | lm loss: 7.265352E+00 | loss scale: 16384.0 | grad norm: 87511.848 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1298/ 159576 | consumed samples: 20768 | elapsed time per iteration (ms): 13622.0 | learning rate: 5.760E-06 | global batch size: 16 | lm loss: 7.446327E+00 | loss scale: 16384.0 | grad norm: 91155.668 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1299/ 159576 | consumed samples: 20784 | elapsed time per iteration (ms): 13674.5 | learning rate: 5.765E-06 | global batch size: 16 | lm loss: 7.469901E+00 | loss scale: 16384.0 | grad norm: 219048.196 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1300/ 159576 | consumed samples: 20800 | elapsed time per iteration (ms): 13848.4 | learning rate: 5.769E-06 | global batch size: 16 | lm loss: 7.389014E+00 | loss scale: 16384.0 | grad norm: 84402.094 | num zeros: 0.0 | number of skipped 
iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1301/ 159576 | consumed samples: 20816 | elapsed time per iteration (ms): 13625.0 | learning rate: 5.774E-06 | global batch size: 16 | lm loss: 7.303530E+00 | loss scale: 16384.0 | grad norm: 174901.504 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1302/ 159576 | consumed samples: 20832 | elapsed time per iteration (ms): 13624.5 | learning rate: 5.778E-06 | global batch size: 16 | lm loss: 7.358258E+00 | loss scale: 16384.0 | grad norm: 146018.382 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1303/ 159576 | consumed samples: 20848 | elapsed time per iteration (ms): 13602.8 | learning rate: 5.783E-06 | global batch size: 16 | lm loss: 7.337800E+00 | loss scale: 16384.0 | grad norm: 109327.316 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1304/ 159576 | consumed samples: 20864 | elapsed time per iteration (ms): 13628.1 | learning rate: 5.787E-06 | global batch size: 16 | lm loss: 7.310088E+00 | loss scale: 16384.0 | grad norm: 83547.733 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1305/ 159576 | consumed samples: 20880 | elapsed time per iteration (ms): 13754.8 | learning rate: 5.791E-06 | global batch size: 16 | lm loss: 7.464965E+00 | loss scale: 16384.0 | grad norm: 695515.315 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1306/ 159576 | consumed samples: 20896 | elapsed time per iteration (ms): 13652.7 | learning rate: 5.796E-06 | global batch size: 16 | lm loss: 7.764376E+00 | loss scale: 16384.0 | grad norm: 569876.871 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1307/ 159576 | consumed samples: 20912 | elapsed time per iteration (ms): 13609.0 | learning rate: 5.800E-06 | global batch size: 16 | lm loss: 7.550226E+00 | loss scale: 16384.0 | grad norm: 356748.186 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1308/ 159576 | consumed samples: 20928 | elapsed time per iteration (ms): 13602.6 | learning rate: 5.805E-06 | global batch size: 16 | lm loss: 7.402792E+00 | loss scale: 16384.0 | grad norm: 159267.929 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1309/ 159576 | consumed samples: 20944 | elapsed time per iteration (ms): 13968.8 | learning rate: 5.809E-06 | global batch size: 16 | lm loss: 7.204682E+00 | loss scale: 16384.0 | grad norm: 129995.340 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1310/ 159576 | consumed samples: 20960 | elapsed time per iteration (ms): 13646.5 | learning rate: 5.814E-06 | global batch size: 16 | lm loss: 7.591084E+00 | loss scale: 16384.0 | grad norm: 143380.550 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1311/ 159576 | consumed samples: 20976 | elapsed time per iteration (ms): 13595.1 | learning rate: 5.818E-06 | global batch size: 16 | lm loss: 7.316426E+00 | loss scale: 16384.0 | grad norm: 150593.992 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1312/ 159576 | consumed samples: 20992 | elapsed time per iteration (ms): 13595.5 | 
learning rate: 5.822E-06 | global batch size: 16 | lm loss: 7.305964E+00 | loss scale: 16384.0 | grad norm: 177049.360 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1313/ 159576 | consumed samples: 21008 | elapsed time per iteration (ms): 13979.9 | learning rate: 5.827E-06 | global batch size: 16 | lm loss: 7.567747E+00 | loss scale: 16384.0 | grad norm: 169809.702 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1314/ 159576 | consumed samples: 21024 | elapsed time per iteration (ms): 13640.7 | learning rate: 5.831E-06 | global batch size: 16 | lm loss: 7.395080E+00 | loss scale: 16384.0 | grad norm: 145564.791 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1315/ 159576 | consumed samples: 21040 | elapsed time per iteration (ms): 13592.0 | learning rate: 5.836E-06 | global batch size: 16 | lm loss: 7.317047E+00 | loss scale: 16384.0 | grad norm: 104694.703 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1316/ 159576 | consumed samples: 21056 | elapsed time per iteration (ms): 13586.9 | learning rate: 5.840E-06 | global batch size: 16 | lm loss: 7.255484E+00 | loss scale: 16384.0 | grad norm: 93976.240 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1317/ 159576 | consumed samples: 21072 | elapsed time per iteration (ms): 13589.9 | learning rate: 5.845E-06 | global batch size: 16 | lm loss: 7.440733E+00 | loss scale: 16384.0 | grad norm: 181969.447 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1318/ 159576 | consumed samples: 21088 | elapsed time per iteration (ms): 13777.5 | learning rate: 5.849E-06 | global batch size: 16 | lm loss: 7.425194E+00 | loss scale: 16384.0 | grad norm: 109784.173 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1319/ 159576 | consumed samples: 21104 | elapsed time per iteration (ms): 13622.9 | learning rate: 5.854E-06 | global batch size: 16 | lm loss: 7.338997E+00 | loss scale: 16384.0 | grad norm: 146618.704 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1320/ 159576 | consumed samples: 21120 | elapsed time per iteration (ms): 13655.9 | learning rate: 5.858E-06 | global batch size: 16 | lm loss: 7.517268E+00 | loss scale: 16384.0 | grad norm: 108508.882 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1321/ 159576 | consumed samples: 21136 | elapsed time per iteration (ms): 13535.6 | learning rate: 5.862E-06 | global batch size: 16 | lm loss: 7.358712E+00 | loss scale: 16384.0 | grad norm: 100699.582 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1322/ 159576 | consumed samples: 21152 | elapsed time per iteration (ms): 13935.1 | learning rate: 5.867E-06 | global batch size: 16 | lm loss: 7.184452E+00 | loss scale: 16384.0 | grad norm: 85896.066 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1323/ 159576 | consumed samples: 21168 | elapsed time per iteration (ms): 13612.2 | learning rate: 5.871E-06 | global batch size: 16 | lm loss: 7.388680E+00 | loss scale: 16384.0 | grad norm: 283765.557 | num zeros: 0.0 | number of skipped 
iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1324/ 159576 | consumed samples: 21184 | elapsed time per iteration (ms): 13600.2 | learning rate: 5.876E-06 | global batch size: 16 | lm loss: 7.594103E+00 | loss scale: 16384.0 | grad norm: 191758.573 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1325/ 159576 | consumed samples: 21200 | elapsed time per iteration (ms): 13592.0 | learning rate: 5.880E-06 | global batch size: 16 | lm loss: 7.443296E+00 | loss scale: 16384.0 | grad norm: 112255.550 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1326/ 159576 | consumed samples: 21216 | elapsed time per iteration (ms): 13594.2 | learning rate: 5.885E-06 | global batch size: 16 | lm loss: 7.192332E+00 | loss scale: 16384.0 | grad norm: 110320.623 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1327/ 159576 | consumed samples: 21232 | elapsed time per iteration (ms): 13762.8 | learning rate: 5.889E-06 | global batch size: 16 | lm loss: 8.096416E+00 | loss scale: 16384.0 | grad norm: 131448.164 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1328/ 159576 | consumed samples: 21248 | elapsed time per iteration (ms): 13579.8 | learning rate: 5.893E-06 | global batch size: 16 | lm loss: 7.433802E+00 | loss scale: 16384.0 | grad norm: 182837.970 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1329/ 159576 | consumed samples: 21264 | elapsed time per iteration (ms): 13581.7 | learning rate: 5.898E-06 | global batch size: 16 | lm loss: 7.172110E+00 | loss scale: 16384.0 | grad norm: 100348.173 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1330/ 159576 | consumed samples: 21280 | elapsed time per iteration (ms): 13583.6 | learning rate: 5.902E-06 | global batch size: 16 | lm loss: 7.240623E+00 | loss scale: 16384.0 | grad norm: 100150.341 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1331/ 159576 | consumed samples: 21296 | elapsed time per iteration (ms): 14102.4 | learning rate: 5.907E-06 | global batch size: 16 | lm loss: 7.203824E+00 | loss scale: 16384.0 | grad norm: 241560.384 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1332/ 159576 | consumed samples: 21312 | elapsed time per iteration (ms): 13644.3 | learning rate: 5.911E-06 | global batch size: 16 | lm loss: 7.245723E+00 | loss scale: 16384.0 | grad norm: 129411.280 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1333/ 159576 | consumed samples: 21328 | elapsed time per iteration (ms): 13656.6 | learning rate: 5.916E-06 | global batch size: 16 | lm loss: 7.574631E+00 | loss scale: 16384.0 | grad norm: 172987.034 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1334/ 159576 | consumed samples: 21344 | elapsed time per iteration (ms): 13588.8 | learning rate: 5.920E-06 | global batch size: 16 | lm loss: 7.287757E+00 | loss scale: 16384.0 | grad norm: 99651.568 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1335/ 159576 | consumed samples: 21360 | elapsed time per iteration (ms): 14011.8 | 
learning rate: 5.925E-06 | global batch size: 16 | lm loss: 7.268057E+00 | loss scale: 16384.0 | grad norm: 109280.402 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1336/ 159576 | consumed samples: 21376 | elapsed time per iteration (ms): 13624.4 | learning rate: 5.929E-06 | global batch size: 16 | lm loss: 7.062439E+00 | loss scale: 16384.0 | grad norm: 160438.049 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1337/ 159576 | consumed samples: 21392 | elapsed time per iteration (ms): 13544.1 | learning rate: 5.933E-06 | global batch size: 16 | lm loss: 7.233086E+00 | loss scale: 16384.0 | grad norm: 175313.966 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1338/ 159576 | consumed samples: 21408 | elapsed time per iteration (ms): 13619.6 | learning rate: 5.938E-06 | global batch size: 16 | lm loss: 7.333053E+00 | loss scale: 16384.0 | grad norm: 104091.148 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1339/ 159576 | consumed samples: 21424 | elapsed time per iteration (ms): 13622.4 | learning rate: 5.942E-06 | global batch size: 16 | lm loss: 7.263519E+00 | loss scale: 16384.0 | grad norm: 90175.391 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1340/ 159576 | consumed samples: 21440 | elapsed time per iteration (ms): 13736.6 | learning rate: 5.947E-06 | global batch size: 16 | lm loss: 7.445864E+00 | loss scale: 16384.0 | grad norm: 136689.970 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1341/ 159576 | consumed samples: 21456 | elapsed time per iteration (ms): 13686.3 | learning rate: 5.951E-06 | global batch size: 16 | lm loss: 7.362231E+00 | loss scale: 16384.0 | grad norm: 184602.422 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1342/ 159576 | consumed samples: 21472 | elapsed time per iteration (ms): 13488.8 | learning rate: 5.956E-06 | global batch size: 16 | lm loss: 7.368071E+00 | loss scale: 16384.0 | grad norm: 82633.413 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1343/ 159576 | consumed samples: 21488 | elapsed time per iteration (ms): 13605.8 | learning rate: 5.960E-06 | global batch size: 16 | lm loss: 7.327272E+00 | loss scale: 16384.0 | grad norm: 92741.507 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1344/ 159576 | consumed samples: 21504 | elapsed time per iteration (ms): 14069.0 | learning rate: 5.964E-06 | global batch size: 16 | lm loss: 7.323634E+00 | loss scale: 16384.0 | grad norm: 99780.106 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1345/ 159576 | consumed samples: 21520 | elapsed time per iteration (ms): 13450.7 | learning rate: 5.969E-06 | global batch size: 16 | lm loss: 7.741362E+00 | loss scale: 16384.0 | grad norm: 105396.793 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1346/ 159576 | consumed samples: 21536 | elapsed time per iteration (ms): 13598.3 | learning rate: 5.973E-06 | global batch size: 16 | lm loss: 7.280247E+00 | loss scale: 16384.0 | grad norm: 77724.692 | num zeros: 0.0 | number of skipped 
iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1347/ 159576 | consumed samples: 21552 | elapsed time per iteration (ms): 13585.6 | learning rate: 5.978E-06 | global batch size: 16 | lm loss: 7.398378E+00 | loss scale: 16384.0 | grad norm: 69954.709 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1348/ 159576 | consumed samples: 21568 | elapsed time per iteration (ms): 13610.3 | learning rate: 5.982E-06 | global batch size: 16 | lm loss: 7.321609E+00 | loss scale: 16384.0 | grad norm: 94086.734 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1349/ 159576 | consumed samples: 21584 | elapsed time per iteration (ms): 13777.1 | learning rate: 5.987E-06 | global batch size: 16 | lm loss: 7.188628E+00 | loss scale: 16384.0 | grad norm: 81475.279 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1350/ 159576 | consumed samples: 21600 | elapsed time per iteration (ms): 13566.9 | learning rate: 5.991E-06 | global batch size: 16 | lm loss: 7.515175E+00 | loss scale: 16384.0 | grad norm: 78780.993 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1351/ 159576 | consumed samples: 21616 | elapsed time per iteration (ms): 13622.9 | learning rate: 5.996E-06 | global batch size: 16 | lm loss: 7.231083E+00 | loss scale: 16384.0 | grad norm: 86153.703 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1352/ 159576 | consumed samples: 21632 | elapsed time per iteration (ms): 13562.3 | learning rate: 6.000E-06 | global batch size: 16 | lm loss: 7.206710E+00 | loss scale: 16384.0 | grad norm: 83949.216 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1353/ 159576 | consumed samples: 21648 | elapsed time per iteration (ms): 13968.8 | learning rate: 6.004E-06 | global batch size: 16 | lm loss: 7.293135E+00 | loss scale: 16384.0 | grad norm: 83956.626 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1354/ 159576 | consumed samples: 21664 | elapsed time per iteration (ms): 13680.7 | learning rate: 6.009E-06 | global batch size: 16 | lm loss: 7.282973E+00 | loss scale: 16384.0 | grad norm: 102770.063 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1355/ 159576 | consumed samples: 21680 | elapsed time per iteration (ms): 13601.4 | learning rate: 6.013E-06 | global batch size: 16 | lm loss: 7.427012E+00 | loss scale: 16384.0 | grad norm: 87455.923 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1356/ 159576 | consumed samples: 21696 | elapsed time per iteration (ms): 13542.1 | learning rate: 6.018E-06 | global batch size: 16 | lm loss: 7.529208E+00 | loss scale: 16384.0 | grad norm: 83130.183 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1357/ 159576 | consumed samples: 21712 | elapsed time per iteration (ms): 13961.0 | learning rate: 6.022E-06 | global batch size: 16 | lm loss: 7.327049E+00 | loss scale: 16384.0 | grad norm: 77841.440 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1358/ 159576 | consumed samples: 21728 | elapsed time per iteration (ms): 13587.5 | learning 
rate: 6.027E-06 | global batch size: 16 | lm loss: 7.267120E+00 | loss scale: 16384.0 | grad norm: 86295.759 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1359/ 159576 | consumed samples: 21744 | elapsed time per iteration (ms): 13505.9 | learning rate: 6.031E-06 | global batch size: 16 | lm loss: 7.190462E+00 | loss scale: 16384.0 | grad norm: 154865.118 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1360/ 159576 | consumed samples: 21760 | elapsed time per iteration (ms): 13616.0 | learning rate: 6.036E-06 | global batch size: 16 | lm loss: 7.321602E+00 | loss scale: 16384.0 | grad norm: 112461.941 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1361/ 159576 | consumed samples: 21776 | elapsed time per iteration (ms): 13547.3 | learning rate: 6.040E-06 | global batch size: 16 | lm loss: 7.145373E+00 | loss scale: 16384.0 | grad norm: 72055.762 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1362/ 159576 | consumed samples: 21792 | elapsed time per iteration (ms): 13692.3 | learning rate: 6.044E-06 | global batch size: 16 | lm loss: 7.077173E+00 | loss scale: 16384.0 | grad norm: 103896.131 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1363/ 159576 | consumed samples: 21808 | elapsed time per iteration (ms): 13612.5 | learning rate: 6.049E-06 | global batch size: 16 | lm loss: 7.245114E+00 | loss scale: 16384.0 | grad norm: 79354.159 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1364/ 159576 | consumed samples: 21824 | elapsed time per iteration (ms): 13541.3 | learning rate: 6.053E-06 | global batch size: 16 | lm loss: 7.281060E+00 | loss scale: 16384.0 | grad norm: 148274.049 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1365/ 159576 | consumed samples: 21840 | elapsed time per iteration (ms): 13609.2 | learning rate: 6.058E-06 | global batch size: 16 | lm loss: 7.401906E+00 | loss scale: 16384.0 | grad norm: 119123.195 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1366/ 159576 | consumed samples: 21856 | elapsed time per iteration (ms): 13916.7 | learning rate: 6.062E-06 | global batch size: 16 | lm loss: 7.338102E+00 | loss scale: 16384.0 | grad norm: 93708.417 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1367/ 159576 | consumed samples: 21872 | elapsed time per iteration (ms): 13536.5 | learning rate: 6.067E-06 | global batch size: 16 | lm loss: 7.494397E+00 | loss scale: 16384.0 | grad norm: 130779.852 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1368/ 159576 | consumed samples: 21888 | elapsed time per iteration (ms): 13577.1 | learning rate: 6.071E-06 | global batch size: 16 | lm loss: 7.007359E+00 | loss scale: 16384.0 | grad norm: 94271.242 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1369/ 159576 | consumed samples: 21904 | elapsed time per iteration (ms): 13571.4 | learning rate: 6.075E-06 | global batch size: 16 | lm loss: 7.129241E+00 | loss scale: 16384.0 | grad norm: 129962.794 | num zeros: 0.0 | number of skipped iterations: 0 
| number of nan iterations: 0 | -time (ms) - iteration 1370/ 159576 | consumed samples: 21920 | elapsed time per iteration (ms): 13603.2 | learning rate: 6.080E-06 | global batch size: 16 | lm loss: 7.323318E+00 | loss scale: 16384.0 | grad norm: 138541.774 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1371/ 159576 | consumed samples: 21936 | elapsed time per iteration (ms): 13998.6 | learning rate: 6.084E-06 | global batch size: 16 | lm loss: 7.164912E+00 | loss scale: 16384.0 | grad norm: 95366.588 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1372/ 159576 | consumed samples: 21952 | elapsed time per iteration (ms): 13587.8 | learning rate: 6.089E-06 | global batch size: 16 | lm loss: 7.207436E+00 | loss scale: 16384.0 | grad norm: 95481.009 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1373/ 159576 | consumed samples: 21968 | elapsed time per iteration (ms): 13570.1 | learning rate: 6.093E-06 | global batch size: 16 | lm loss: 7.245305E+00 | loss scale: 16384.0 | grad norm: 110814.337 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1374/ 159576 | consumed samples: 21984 | elapsed time per iteration (ms): 13553.5 | learning rate: 6.098E-06 | global batch size: 16 | lm loss: 7.184179E+00 | loss scale: 16384.0 | grad norm: 92107.034 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1375/ 159576 | consumed samples: 22000 | elapsed time per iteration (ms): 13994.4 | learning rate: 6.102E-06 | global batch size: 16 | lm loss: 7.117487E+00 | loss scale: 16384.0 | grad norm: 77237.913 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1376/ 159576 | consumed samples: 22016 | elapsed time per iteration (ms): 13625.6 | learning rate: 6.107E-06 | global batch size: 16 | lm loss: 7.445632E+00 | loss scale: 16384.0 | grad norm: 139111.184 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1377/ 159576 | consumed samples: 22032 | elapsed time per iteration (ms): 13559.3 | learning rate: 6.111E-06 | global batch size: 16 | lm loss: 7.513434E+00 | loss scale: 16384.0 | grad norm: 111307.588 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1378/ 159576 | consumed samples: 22048 | elapsed time per iteration (ms): 13608.4 | learning rate: 6.115E-06 | global batch size: 16 | lm loss: 7.255265E+00 | loss scale: 16384.0 | grad norm: 88273.307 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1379/ 159576 | consumed samples: 22064 | elapsed time per iteration (ms): 14048.5 | learning rate: 6.120E-06 | global batch size: 16 | lm loss: 7.123577E+00 | loss scale: 16384.0 | grad norm: 85346.614 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1380/ 159576 | consumed samples: 22080 | elapsed time per iteration (ms): 13485.1 | learning rate: 6.124E-06 | global batch size: 16 | lm loss: 7.134797E+00 | loss scale: 16384.0 | grad norm: 118284.165 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1381/ 159576 | consumed samples: 22096 | elapsed time per iteration (ms): 13616.6 | learning rate: 
6.129E-06 | global batch size: 16 | lm loss: 7.281054E+00 | loss scale: 16384.0 | grad norm: 88229.446 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1382/ 159576 | consumed samples: 22112 | elapsed time per iteration (ms): 13576.6 | learning rate: 6.133E-06 | global batch size: 16 | lm loss: 7.397271E+00 | loss scale: 16384.0 | grad norm: 130821.847 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1383/ 159576 | consumed samples: 22128 | elapsed time per iteration (ms): 13587.8 | learning rate: 6.138E-06 | global batch size: 16 | lm loss: 7.362026E+00 | loss scale: 16384.0 | grad norm: 83450.672 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1384/ 159576 | consumed samples: 22144 | elapsed time per iteration (ms): 13848.8 | learning rate: 6.142E-06 | global batch size: 16 | lm loss: 7.275143E+00 | loss scale: 16384.0 | grad norm: 86287.774 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1385/ 159576 | consumed samples: 22160 | elapsed time per iteration (ms): 13576.9 | learning rate: 6.146E-06 | global batch size: 16 | lm loss: 7.400926E+00 | loss scale: 16384.0 | grad norm: 98321.914 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1386/ 159576 | consumed samples: 22176 | elapsed time per iteration (ms): 13627.2 | learning rate: 6.151E-06 | global batch size: 16 | lm loss: 7.151899E+00 | loss scale: 16384.0 | grad norm: 85060.501 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1387/ 159576 | consumed samples: 22192 | elapsed time per iteration (ms): 13519.4 | learning rate: 6.155E-06 | global batch size: 16 | lm loss: 7.335835E+00 | loss scale: 16384.0 | grad norm: 64450.517 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1388/ 159576 | consumed samples: 22208 | elapsed time per iteration (ms): 13906.1 | learning rate: 6.160E-06 | global batch size: 16 | lm loss: 7.316273E+00 | loss scale: 16384.0 | grad norm: 66517.199 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1389/ 159576 | consumed samples: 22224 | elapsed time per iteration (ms): 13589.2 | learning rate: 6.164E-06 | global batch size: 16 | lm loss: 7.190707E+00 | loss scale: 16384.0 | grad norm: 123710.931 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1390/ 159576 | consumed samples: 22240 | elapsed time per iteration (ms): 13545.5 | learning rate: 6.169E-06 | global batch size: 16 | lm loss: 7.337936E+00 | loss scale: 16384.0 | grad norm: 78178.349 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1391/ 159576 | consumed samples: 22256 | elapsed time per iteration (ms): 13564.6 | learning rate: 6.173E-06 | global batch size: 16 | lm loss: 7.539785E+00 | loss scale: 16384.0 | grad norm: 111563.102 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1392/ 159576 | consumed samples: 22272 | elapsed time per iteration (ms): 13891.4 | learning rate: 6.178E-06 | global batch size: 16 | lm loss: 7.071362E+00 | loss scale: 16384.0 | grad norm: 70647.575 | num zeros: 0.0 | number of skipped iterations: 0 | number 
of nan iterations: 0 | -time (ms) - iteration 1393/ 159576 | consumed samples: 22288 | elapsed time per iteration (ms): 13681.2 | learning rate: 6.182E-06 | global batch size: 16 | lm loss: 7.133610E+00 | loss scale: 16384.0 | grad norm: 124103.863 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1394/ 159576 | consumed samples: 22304 | elapsed time per iteration (ms): 13531.0 | learning rate: 6.186E-06 | global batch size: 16 | lm loss: 7.323411E+00 | loss scale: 16384.0 | grad norm: 99951.813 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1395/ 159576 | consumed samples: 22320 | elapsed time per iteration (ms): 13568.0 | learning rate: 6.191E-06 | global batch size: 16 | lm loss: 7.184701E+00 | loss scale: 16384.0 | grad norm: 71905.862 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1396/ 159576 | consumed samples: 22336 | elapsed time per iteration (ms): 13541.4 | learning rate: 6.195E-06 | global batch size: 16 | lm loss: 7.166233E+00 | loss scale: 16384.0 | grad norm: 81874.132 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1397/ 159576 | consumed samples: 22352 | elapsed time per iteration (ms): 13897.4 | learning rate: 6.200E-06 | global batch size: 16 | lm loss: 7.247505E+00 | loss scale: 16384.0 | grad norm: 84059.366 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1398/ 159576 | consumed samples: 22368 | elapsed time per iteration (ms): 13621.5 | learning rate: 6.204E-06 | global batch size: 16 | lm loss: 7.240150E+00 | loss scale: 16384.0 | grad norm: 119489.831 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1399/ 159576 | consumed samples: 22384 | elapsed time per iteration (ms): 13579.9 | learning rate: 6.209E-06 | global batch size: 16 | lm loss: 7.294222E+00 | loss scale: 16384.0 | grad norm: 80417.137 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1400/ 159576 | consumed samples: 22400 | elapsed time per iteration (ms): 13625.0 | learning rate: 6.213E-06 | global batch size: 16 | lm loss: 7.203695E+00 | loss scale: 16384.0 | grad norm: 97654.667 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1401/ 159576 | consumed samples: 22416 | elapsed time per iteration (ms): 14002.5 | learning rate: 6.217E-06 | global batch size: 16 | lm loss: 7.173908E+00 | loss scale: 16384.0 | grad norm: 72597.723 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1402/ 159576 | consumed samples: 22432 | elapsed time per iteration (ms): 13559.2 | learning rate: 6.222E-06 | global batch size: 16 | lm loss: 7.213487E+00 | loss scale: 16384.0 | grad norm: 108337.821 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1403/ 159576 | consumed samples: 22448 | elapsed time per iteration (ms): 13615.0 | learning rate: 6.226E-06 | global batch size: 16 | lm loss: 7.295056E+00 | loss scale: 16384.0 | grad norm: 109464.933 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1404/ 159576 | consumed samples: 22464 | elapsed time per iteration (ms): 13479.3 | learning rate: 6.231E-06 | 
global batch size: 16 | lm loss: 7.070762E+00 | loss scale: 16384.0 | grad norm: 70008.382 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1405/ 159576 | consumed samples: 22480 | elapsed time per iteration (ms): 13573.2 | learning rate: 6.235E-06 | global batch size: 16 | lm loss: 7.206651E+00 | loss scale: 16384.0 | grad norm: 71456.680 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1406/ 159576 | consumed samples: 22496 | elapsed time per iteration (ms): 13670.7 | learning rate: 6.240E-06 | global batch size: 16 | lm loss: 7.421339E+00 | loss scale: 16384.0 | grad norm: 81529.039 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1407/ 159576 | consumed samples: 22512 | elapsed time per iteration (ms): 13510.9 | learning rate: 6.244E-06 | global batch size: 16 | lm loss: 7.245395E+00 | loss scale: 16384.0 | grad norm: 120780.179 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1408/ 159576 | consumed samples: 22528 | elapsed time per iteration (ms): 13544.4 | learning rate: 6.249E-06 | global batch size: 16 | lm loss: 7.479702E+00 | loss scale: 16384.0 | grad norm: 98091.848 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1409/ 159576 | consumed samples: 22544 | elapsed time per iteration (ms): 13558.7 | learning rate: 6.253E-06 | global batch size: 16 | lm loss: 7.220355E+00 | loss scale: 16384.0 | grad norm: 71818.367 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1410/ 159576 | consumed samples: 22560 | elapsed time per iteration (ms): 13949.7 | learning rate: 6.257E-06 | global batch size: 16 | lm loss: 7.381415E+00 | loss scale: 16384.0 | grad norm: 80168.457 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1411/ 159576 | consumed samples: 22576 | elapsed time per iteration (ms): 13573.4 | learning rate: 6.262E-06 | global batch size: 16 | lm loss: 7.330766E+00 | loss scale: 16384.0 | grad norm: 107261.861 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1412/ 159576 | consumed samples: 22592 | elapsed time per iteration (ms): 13522.9 | learning rate: 6.266E-06 | global batch size: 16 | lm loss: 7.378265E+00 | loss scale: 16384.0 | grad norm: 115619.714 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1413/ 159576 | consumed samples: 22608 | elapsed time per iteration (ms): 13584.4 | learning rate: 6.271E-06 | global batch size: 16 | lm loss: 7.202836E+00 | loss scale: 16384.0 | grad norm: 70230.767 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1414/ 159576 | consumed samples: 22624 | elapsed time per iteration (ms): 13797.1 | learning rate: 6.275E-06 | global batch size: 16 | lm loss: 7.202533E+00 | loss scale: 16384.0 | grad norm: 122640.667 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1415/ 159576 | consumed samples: 22640 | elapsed time per iteration (ms): 13736.9 | learning rate: 6.280E-06 | global batch size: 16 | lm loss: 7.271989E+00 | loss scale: 16384.0 | grad norm: 80706.550 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan 
iterations: 0 | -time (ms) - iteration 1416/ 159576 | consumed samples: 22656 | elapsed time per iteration (ms): 13603.3 | learning rate: 6.284E-06 | global batch size: 16 | lm loss: 7.350783E+00 | loss scale: 16384.0 | grad norm: 106402.600 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1417/ 159576 | consumed samples: 22672 | elapsed time per iteration (ms): 13663.2 | learning rate: 6.288E-06 | global batch size: 16 | lm loss: 7.629884E+00 | loss scale: 16384.0 | grad norm: 111978.514 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1418/ 159576 | consumed samples: 22688 | elapsed time per iteration (ms): 13512.0 | learning rate: 6.293E-06 | global batch size: 16 | lm loss: 7.276966E+00 | loss scale: 16384.0 | grad norm: 86564.098 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1419/ 159576 | consumed samples: 22704 | elapsed time per iteration (ms): 13947.9 | learning rate: 6.297E-06 | global batch size: 16 | lm loss: 7.109100E+00 | loss scale: 16384.0 | grad norm: 85621.258 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1420/ 159576 | consumed samples: 22720 | elapsed time per iteration (ms): 13554.6 | learning rate: 6.302E-06 | global batch size: 16 | lm loss: 7.234724E+00 | loss scale: 16384.0 | grad norm: 115238.437 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1421/ 159576 | consumed samples: 22736 | elapsed time per iteration (ms): 13608.2 | learning rate: 6.306E-06 | global batch size: 16 | lm loss: 7.134557E+00 | loss scale: 16384.0 | grad norm: 127475.605 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1422/ 159576 | consumed samples: 22752 | elapsed time per iteration (ms): 13564.6 | learning rate: 6.311E-06 | global batch size: 16 | lm loss: 7.096246E+00 | loss scale: 16384.0 | grad norm: 92678.765 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1423/ 159576 | consumed samples: 22768 | elapsed time per iteration (ms): 13993.7 | learning rate: 6.315E-06 | global batch size: 16 | lm loss: 7.215540E+00 | loss scale: 16384.0 | grad norm: 77823.778 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1424/ 159576 | consumed samples: 22784 | elapsed time per iteration (ms): 13635.8 | learning rate: 6.320E-06 | global batch size: 16 | lm loss: 7.332169E+00 | loss scale: 16384.0 | grad norm: 88585.736 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1425/ 159576 | consumed samples: 22800 | elapsed time per iteration (ms): 13477.0 | learning rate: 6.324E-06 | global batch size: 16 | lm loss: 7.224688E+00 | loss scale: 16384.0 | grad norm: 98593.171 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1426/ 159576 | consumed samples: 22816 | elapsed time per iteration (ms): 13579.9 | learning rate: 6.328E-06 | global batch size: 16 | lm loss: 7.330650E+00 | loss scale: 16384.0 | grad norm: 101929.983 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1427/ 159576 | consumed samples: 22832 | elapsed time per iteration (ms): 13559.4 | learning rate: 6.333E-06 | global 
batch size: 16 | lm loss: 7.261027E+00 | loss scale: 16384.0 | grad norm: 79893.479 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1428/ 159576 | consumed samples: 22848 | elapsed time per iteration (ms): 13656.6 | learning rate: 6.337E-06 | global batch size: 16 | lm loss: 7.050019E+00 | loss scale: 16384.0 | grad norm: 197668.137 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1429/ 159576 | consumed samples: 22864 | elapsed time per iteration (ms): 13549.3 | learning rate: 6.342E-06 | global batch size: 16 | lm loss: 7.283052E+00 | loss scale: 16384.0 | grad norm: 185482.345 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1430/ 159576 | consumed samples: 22880 | elapsed time per iteration (ms): 13566.6 | learning rate: 6.346E-06 | global batch size: 16 | lm loss: 7.251038E+00 | loss scale: 16384.0 | grad norm: 81246.801 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1431/ 159576 | consumed samples: 22896 | elapsed time per iteration (ms): 13626.6 | learning rate: 6.351E-06 | global batch size: 16 | lm loss: 7.363044E+00 | loss scale: 16384.0 | grad norm: 89555.992 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1432/ 159576 | consumed samples: 22912 | elapsed time per iteration (ms): 14023.4 | learning rate: 6.355E-06 | global batch size: 16 | lm loss: 7.350190E+00 | loss scale: 16384.0 | grad norm: 151476.896 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1433/ 159576 | consumed samples: 22928 | elapsed time per iteration (ms): 13376.0 | learning rate: 6.359E-06 | global batch size: 16 | lm loss: 7.294331E+00 | loss scale: 16384.0 | grad norm: 148300.162 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1434/ 159576 | consumed samples: 22944 | elapsed time per iteration (ms): 13594.6 | learning rate: 6.364E-06 | global batch size: 16 | lm loss: 7.178850E+00 | loss scale: 16384.0 | grad norm: 115814.774 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1435/ 159576 | consumed samples: 22960 | elapsed time per iteration (ms): 13589.5 | learning rate: 6.368E-06 | global batch size: 16 | lm loss: 7.174537E+00 | loss scale: 16384.0 | grad norm: 89057.264 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1436/ 159576 | consumed samples: 22976 | elapsed time per iteration (ms): 13854.5 | learning rate: 6.373E-06 | global batch size: 16 | lm loss: 7.455090E+00 | loss scale: 16384.0 | grad norm: 143357.692 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1437/ 159576 | consumed samples: 22992 | elapsed time per iteration (ms): 13800.5 | learning rate: 6.377E-06 | global batch size: 16 | lm loss: 7.230480E+00 | loss scale: 16384.0 | grad norm: 124647.889 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1438/ 159576 | consumed samples: 23008 | elapsed time per iteration (ms): 13574.3 | learning rate: 6.382E-06 | global batch size: 16 | lm loss: 7.214196E+00 | loss scale: 16384.0 | grad norm: 90534.924 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1439/ 159576 | consumed samples: 23024 | elapsed time per iteration (ms): 13559.7 | learning rate: 6.386E-06 | global batch size: 16 | lm loss: 7.228687E+00 | loss scale: 16384.0 | grad norm: 100823.134 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1440/ 159576 | consumed samples: 23040 | elapsed time per iteration (ms): 13580.1 | learning rate: 6.391E-06 | global batch size: 16 | lm loss: 7.297411E+00 | loss scale: 16384.0 | grad norm: 72207.799 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1441/ 159576 | consumed samples: 23056 | elapsed time per iteration (ms): 13763.6 | learning rate: 6.395E-06 | global batch size: 16 | lm loss: 7.403437E+00 | loss scale: 16384.0 | grad norm: 227400.170 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1442/ 159576 | consumed samples: 23072 | elapsed time per iteration (ms): 13606.0 | learning rate: 6.399E-06 | global batch size: 16 | lm loss: 7.267770E+00 | loss scale: 16384.0 | grad norm: 178424.275 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1443/ 159576 | consumed samples: 23088 | elapsed time per iteration (ms): 13579.5 | learning rate: 6.404E-06 | global batch size: 16 | lm loss: 7.196310E+00 | loss scale: 16384.0 | grad norm: 93737.230 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1444/ 159576 | consumed samples: 23104 | elapsed time per iteration (ms): 13564.8 | learning rate: 6.408E-06 | global batch size: 16 | lm loss: 7.180475E+00 | loss scale: 16384.0 | grad norm: 107567.132 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1445/ 159576 | consumed samples: 23120 | elapsed time per iteration (ms): 14086.1 | learning rate: 6.413E-06 | global batch size: 16 | lm loss: 7.235699E+00 | loss scale: 16384.0 | grad norm: 90017.706 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1446/ 159576 | consumed samples: 23136 | elapsed time per iteration (ms): 13420.4 | learning rate: 6.417E-06 | global batch size: 16 | lm loss: 7.131771E+00 | loss scale: 16384.0 | grad norm: 200715.783 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1447/ 159576 | consumed samples: 23152 | elapsed time per iteration (ms): 13582.8 | learning rate: 6.422E-06 | global batch size: 16 | lm loss: 7.147336E+00 | loss scale: 16384.0 | grad norm: 139041.379 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1448/ 159576 | consumed samples: 23168 | elapsed time per iteration (ms): 13591.5 | learning rate: 6.426E-06 | global batch size: 16 | lm loss: 7.223548E+00 | loss scale: 16384.0 | grad norm: 81314.906 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1449/ 159576 | consumed samples: 23184 | elapsed time per iteration (ms): 13543.2 | learning rate: 6.430E-06 | global batch size: 16 | lm loss: 7.126436E+00 | loss scale: 16384.0 | grad norm: 104656.231 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1450/ 159576 | consumed samples: 23200 | elapsed time per iteration (ms): 13771.0 | learning rate: 6.435E-06 | global batch size: 16 | lm loss: 7.239769E+00 | loss scale: 16384.0 | grad norm: 55782.887 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1451/ 159576 | consumed samples: 23216 | elapsed time per iteration (ms): 13581.7 | learning rate: 6.439E-06 | global batch size: 16 | lm loss: 7.431156E+00 | loss scale: 16384.0 | grad norm: 265376.495 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1452/ 159576 | consumed samples: 23232 | elapsed time per iteration (ms): 13633.4 | learning rate: 6.444E-06 | global batch size: 16 | lm loss: 7.120412E+00 | loss scale: 16384.0 | grad norm: 153821.211 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1453/ 159576 | consumed samples: 23248 | elapsed time per iteration (ms): 13510.9 | learning rate: 6.448E-06 | global batch size: 16 | lm loss: 7.361814E+00 | loss scale: 16384.0 | grad norm: 91484.610 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1454/ 159576 | consumed samples: 23264 | elapsed time per iteration (ms): 14008.9 | learning rate: 6.453E-06 | global batch size: 16 | lm loss: 7.429213E+00 | loss scale: 16384.0 | grad norm: 95193.402 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1455/ 159576 | consumed samples: 23280 | elapsed time per iteration (ms): 13534.7 | learning rate: 6.457E-06 | global batch size: 16 | lm loss: 7.311771E+00 | loss scale: 16384.0 | grad norm: 99688.210 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1456/ 159576 | consumed samples: 23296 | elapsed time per iteration (ms): 13570.9 | learning rate: 6.462E-06 | global batch size: 16 | lm loss: 7.326795E+00 | loss scale: 16384.0 | grad norm: 199002.918 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1457/ 159576 | consumed samples: 23312 | elapsed time per iteration (ms): 13567.6 | learning rate: 6.466E-06 | global batch size: 16 | lm loss: 7.238305E+00 | loss scale: 16384.0 | grad norm: 148524.516 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1458/ 159576 | consumed samples: 23328 | elapsed time per iteration (ms): 14002.9 | learning rate: 6.470E-06 | global batch size: 16 | lm loss: 7.170752E+00 | loss scale: 16384.0 | grad norm: 83892.787 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1459/ 159576 | consumed samples: 23344 | elapsed time per iteration (ms): 13758.9 | learning rate: 6.475E-06 | global batch size: 16 | lm loss: 7.148302E+00 | loss scale: 16384.0 | grad norm: 92326.384 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1460/ 159576 | consumed samples: 23360 | elapsed time per iteration (ms): 13596.9 | learning rate: 6.479E-06 | global batch size: 16 | lm loss: 7.386099E+00 | loss scale: 16384.0 | grad norm: 141912.785 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1461/ 159576 | consumed samples: 23376 | elapsed time per iteration (ms): 13627.4 | learning rate: 6.484E-06 | global batch size: 16 | lm loss: 7.288848E+00 | loss scale: 16384.0 | grad norm: 170265.777 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1462/ 159576 | consumed samples: 23392 | elapsed time per iteration (ms): 13618.4 | learning rate: 6.488E-06 | global batch size: 16 | lm loss: 7.229756E+00 | loss scale: 16384.0 | grad norm: 120999.804 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1463/ 159576 | consumed samples: 23408 | elapsed time per iteration (ms): 13656.7 | learning rate: 6.493E-06 | global batch size: 16 | lm loss: 7.281564E+00 | loss scale: 16384.0 | grad norm: 93039.502 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1464/ 159576 | consumed samples: 23424 | elapsed time per iteration (ms): 13645.1 | learning rate: 6.497E-06 | global batch size: 16 | lm loss: 7.287534E+00 | loss scale: 16384.0 | grad norm: 80620.713 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1465/ 159576 | consumed samples: 23440 | elapsed time per iteration (ms): 13567.3 | learning rate: 6.501E-06 | global batch size: 16 | lm loss: 7.328496E+00 | loss scale: 16384.0 | grad norm: 125622.289 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1466/ 159576 | consumed samples: 23456 | elapsed time per iteration (ms): 13597.3 | learning rate: 6.506E-06 | global batch size: 16 | lm loss: 7.289563E+00 | loss scale: 16384.0 | grad norm: 115928.663 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1467/ 159576 | consumed samples: 23472 | elapsed time per iteration (ms): 13941.8 | learning rate: 6.510E-06 | global batch size: 16 | lm loss: 7.383677E+00 | loss scale: 16384.0 | grad norm: 88787.769 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1468/ 159576 | consumed samples: 23488 | elapsed time per iteration (ms): 13557.9 | learning rate: 6.515E-06 | global batch size: 16 | lm loss: 7.200576E+00 | loss scale: 16384.0 | grad norm: 72136.963 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1469/ 159576 | consumed samples: 23504 | elapsed time per iteration (ms): 13659.8 | learning rate: 6.519E-06 | global batch size: 16 | lm loss: 7.237146E+00 | loss scale: 16384.0 | grad norm: 80384.892 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1470/ 159576 | consumed samples: 23520 | elapsed time per iteration (ms): 13520.5 | learning rate: 6.524E-06 | global batch size: 16 | lm loss: 7.087498E+00 | loss scale: 16384.0 | grad norm: 84910.064 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1471/ 159576 | consumed samples: 23536 | elapsed time per iteration (ms): 13587.4 | learning rate: 6.528E-06 | global batch size: 16 | lm loss: 7.201303E+00 | loss scale: 16384.0 | grad norm: 82344.270 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1472/ 159576 | consumed samples: 23552 | elapsed time per iteration (ms): 13785.3 | learning rate: 6.533E-06 | global batch size: 16 | lm loss: 7.099293E+00 | loss scale: 16384.0 | grad norm: 90694.938 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1473/ 159576 | consumed samples: 23568 | elapsed time per iteration (ms): 13564.5 | learning rate: 6.537E-06 | global batch size: 16 | lm loss: 7.241871E+00 | loss scale: 16384.0 | grad norm: 49829.478 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1474/ 159576 | consumed samples: 23584 | elapsed time per iteration (ms): 13624.0 | learning rate: 6.541E-06 | global batch size: 16 | lm loss: 7.157920E+00 | loss scale: 16384.0 | grad norm: 134064.505 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1475/ 159576 | consumed samples: 23600 | elapsed time per iteration (ms): 13651.2 | learning rate: 6.546E-06 | global batch size: 16 | lm loss: 7.214240E+00 | loss scale: 16384.0 | grad norm: 86872.151 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1476/ 159576 | consumed samples: 23616 | elapsed time per iteration (ms): 14166.8 | learning rate: 6.550E-06 | global batch size: 16 | lm loss: 7.192460E+00 | loss scale: 16384.0 | grad norm: 80848.938 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1477/ 159576 | consumed samples: 23632 | elapsed time per iteration (ms): 13604.7 | learning rate: 6.555E-06 | global batch size: 16 | lm loss: 7.323776E+00 | loss scale: 16384.0 | grad norm: 70454.418 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1478/ 159576 | consumed samples: 23648 | elapsed time per iteration (ms): 13572.6 | learning rate: 6.559E-06 | global batch size: 16 | lm loss: 7.268590E+00 | loss scale: 16384.0 | grad norm: 71693.339 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1479/ 159576 | consumed samples: 23664 | elapsed time per iteration (ms): 13608.6 | learning rate: 6.564E-06 | global batch size: 16 | lm loss: 7.296487E+00 | loss scale: 16384.0 | grad norm: 81654.087 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1480/ 159576 | consumed samples: 23680 | elapsed time per iteration (ms): 14039.7 | learning rate: 6.568E-06 | global batch size: 16 | lm loss: 7.090362E+00 | loss scale: 16384.0 | grad norm: 64201.153 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1481/ 159576 | consumed samples: 23696 | elapsed time per iteration (ms): 13583.2 | learning rate: 6.572E-06 | global batch size: 16 | lm loss: 7.375229E+00 | loss scale: 16384.0 | grad norm: 113007.126 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1482/ 159576 | consumed samples: 23712 | elapsed time per iteration (ms): 13660.9 | learning rate: 6.577E-06 | global batch size: 16 | lm loss: 7.293176E+00 | loss scale: 16384.0 | grad norm: 77498.464 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1483/ 159576 | consumed samples: 23728 | elapsed time per iteration (ms): 13614.0 | learning rate: 6.581E-06 | global batch size: 16 | lm loss: 7.336072E+00 | loss scale: 16384.0 | grad norm: 110912.409 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1484/ 159576 | consumed samples: 23744 | elapsed time per iteration (ms): 13566.7 | learning rate: 6.586E-06 | global batch size: 16 | lm loss: 7.364174E+00 | loss scale: 16384.0 | grad norm: 183688.896 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1485/ 159576 | consumed samples: 23760 | elapsed time per iteration (ms): 13815.4 | learning rate: 6.590E-06 | global batch size: 16 | lm loss: 7.239150E+00 | loss scale: 16384.0 | grad norm: 72249.353 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1486/ 159576 | consumed samples: 23776 | elapsed time per iteration (ms): 13589.6 | learning rate: 6.595E-06 | global batch size: 16 | lm loss: 7.200100E+00 | loss scale: 16384.0 | grad norm: 96228.791 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1487/ 159576 | consumed samples: 23792 | elapsed time per iteration (ms): 13607.7 | learning rate: 6.599E-06 | global batch size: 16 | lm loss: 7.292061E+00 | loss scale: 16384.0 | grad norm: 121424.509 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1488/ 159576 | consumed samples: 23808 | elapsed time per iteration (ms): 13632.1 | learning rate: 6.604E-06 | global batch size: 16 | lm loss: 7.136326E+00 | loss scale: 16384.0 | grad norm: 126581.190 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1489/ 159576 | consumed samples: 23824 | elapsed time per iteration (ms): 14024.4 | learning rate: 6.608E-06 | global batch size: 16 | lm loss: 7.314082E+00 | loss scale: 16384.0 | grad norm: 81672.303 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1490/ 159576 | consumed samples: 23840 | elapsed time per iteration (ms): 13562.3 | learning rate: 6.612E-06 | global batch size: 16 | lm loss: 7.220848E+00 | loss scale: 16384.0 | grad norm: 124864.436 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1491/ 159576 | consumed samples: 23856 | elapsed time per iteration (ms): 13573.1 | learning rate: 6.617E-06 | global batch size: 16 | lm loss: 7.139018E+00 | loss scale: 16384.0 | grad norm: 91430.675 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1492/ 159576 | consumed samples: 23872 | elapsed time per iteration (ms): 13614.3 | learning rate: 6.621E-06 | global batch size: 16 | lm loss: 7.268013E+00 | loss scale: 16384.0 | grad norm: 135716.036 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1493/ 159576 | consumed samples: 23888 | elapsed time per iteration (ms): 13616.6 | learning rate: 6.626E-06 | global batch size: 16 | lm loss: 7.252588E+00 | loss scale: 16384.0 | grad norm: 83740.306 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1494/ 159576 | consumed samples: 23904 | elapsed time per iteration (ms): 13959.7 | learning rate: 6.630E-06 | global batch size: 16 | lm loss: 6.975100E+00 | loss scale: 16384.0 | grad norm: 83284.334 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1495/ 159576 | consumed samples: 23920 | elapsed time per iteration (ms): 13605.9 | learning rate: 6.635E-06 | global batch size: 16 | lm loss: 7.372656E+00 | loss scale: 16384.0 | grad norm: 69225.313 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1496/ 159576 | consumed samples: 23936 | elapsed time per iteration (ms): 13623.3 | learning rate: 6.639E-06 | global batch size: 16 | lm loss: 7.219198E+00 | loss scale: 16384.0 | grad norm: 115429.784 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1497/ 159576 | consumed samples: 23952 | elapsed time per iteration (ms): 13627.9 | learning rate: 6.643E-06 | global batch size: 16 | lm loss: 7.340521E+00 | loss scale: 16384.0 | grad norm: 85290.242 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1498/ 159576 | consumed samples: 23968 | elapsed time per iteration (ms): 13884.3 | learning rate: 6.648E-06 | global batch size: 16 | lm loss: 7.186238E+00 | loss scale: 16384.0 | grad norm: 114903.673 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1499/ 159576 | consumed samples: 23984 | elapsed time per iteration (ms): 13657.7 | learning rate: 6.652E-06 | global batch size: 16 | lm loss: 7.208917E+00 | loss scale: 16384.0 | grad norm: 92023.782 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1500/ 159576 | consumed samples: 24000 | elapsed time per iteration (ms): 13609.5 | learning rate: 6.657E-06 | global batch size: 16 | lm loss: 7.191697E+00 | loss scale: 32768.0 | grad norm: 96438.628 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-saving checkpoint at iteration 1500 to /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints
-[2021-09-24 08:02:02,260] [INFO] [logging.py:68:log_dist] [Rank 0] Saving model checkpoint: /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/global_step1500/mp_rank_00_model_states.pt
- successfully saved checkpoint at iteration 1500 to /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints
-time (ms) | save-checkpoint: 17270.67
- iteration 1501/ 159576 | consumed samples: 24016 | elapsed time per iteration (ms): 31402.6 | learning rate: 6.661E-06 | global batch size: 16 | lm loss: 7.276592E+00 | loss scale: 32768.0 | grad norm: 161966.188 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1502/ 159576 | consumed samples: 24032 | elapsed time per iteration (ms): 13594.3 | learning rate: 6.666E-06 | global batch size: 16 | lm loss: 7.318794E+00 | loss scale: 32768.0 | grad norm: 194567.067 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1503/ 159576 | consumed samples: 24048 | elapsed time per iteration (ms): 13587.2 | learning rate: 6.670E-06 | global batch size: 16 | lm loss: 7.168730E+00 | loss scale: 32768.0 | grad norm: 147316.587 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1504/ 159576 | consumed samples: 24064 | elapsed time per iteration (ms): 13690.3 | learning rate: 6.675E-06 | global batch size: 16 | lm loss: 7.199265E+00 | loss scale: 32768.0 | grad norm: 160502.917 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1505/ 159576 | consumed samples: 24080 | elapsed time per iteration (ms): 14065.5 | learning rate: 6.679E-06 | global batch size: 16 | lm loss: 7.004994E+00 | loss scale: 32768.0 | grad norm: 126147.516 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1506/ 159576 | consumed samples: 24096 | elapsed time per iteration (ms): 13542.1 | learning rate: 6.683E-06 | global batch size: 16 | lm loss: 7.322471E+00 | loss scale: 32768.0 | grad norm: 196683.898 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1507/ 159576 | consumed samples: 24112 | elapsed time per iteration (ms): 13669.0 | learning rate: 6.688E-06 | global batch size: 16 | lm loss: 7.393982E+00 | loss scale: 32768.0 | grad norm: 190898.758 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-[2021-09-24 08:03:56] PULSE: tr8-104B is waiting for the previous Job Array job to finish before scheduling a new one (1162855_[2-10%1] on 'gpu_p13' partition)
-[2021-09-24 08:03:56] PULSE: tr8-104B is running for 2:11:45 since 2021-09-24T05:52:11 (1162855_1 on 'gpu_p13' partition (r6i4n[5,7],r6i5n[2,7-8],r6i6n[0,2,6],r7i2n[4-5],r7i6n[2-4],r7i7n[7-8],r8i0n[2-3,5-8],r8i1n[0,2-4],r8i2n8,r8i3n[0-2],r8i5n[3-4],r8i7n[3-8],r9i0n[0-2],r9i1n[0-3],r9i2n[3-5,8],r9i3n[0-1,7-8],r9i4n[0-2],r9i5n[3-8],r9i6n[0,7-8])
- iteration 1508/ 159576 | consumed samples: 24128 | elapsed time per iteration (ms): 13530.1 | learning rate: 6.692E-06 | global batch size: 16 | lm loss: 7.303823E+00 | loss scale: 32768.0 | grad norm: 138876.766 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1509/ 159576 | consumed samples: 24144 | elapsed time per iteration (ms): 13620.2 | learning rate: 6.697E-06 | global batch size: 16 | lm loss: 7.181733E+00 | loss scale: 32768.0 | grad norm: 245330.128 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1510/ 159576 | consumed samples: 24160 | elapsed time per iteration (ms): 13857.7 | learning rate: 6.701E-06 | global batch size: 16 | lm loss: 7.249762E+00 | loss scale: 32768.0 | grad norm: 178346.781 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1511/ 159576 | consumed samples: 24176 | elapsed time per iteration (ms): 13642.0 | learning rate: 6.706E-06 | global batch size: 16 | lm loss: 7.141682E+00 | loss scale: 32768.0 | grad norm: 225502.316 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1512/ 159576 | consumed samples: 24192 | elapsed time per iteration (ms): 13680.2 | learning rate: 6.710E-06 | global batch size: 16 | lm loss: 7.262461E+00 | loss scale: 32768.0 | grad norm: 152013.376 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1513/ 159576 | consumed samples: 24208 | elapsed time per iteration (ms): 6867.5 | learning rate: 6.710E-06 | global batch size: 16 | lm loss: 7.117817E+00 | loss scale: 32768.0 | grad norm: 152013.376 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1514/ 159576 | consumed samples: 24224 | elapsed time per iteration (ms): 13192.9 | learning rate: 6.714E-06 | global batch size: 16 | lm loss: 7.508438E+00 | loss scale: 32768.0 | grad norm: 277772.591 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1515/ 159576 | consumed samples: 24240 | elapsed time per iteration (ms): 13697.2 | learning rate: 6.719E-06 | global batch size: 16 | lm loss: 7.055306E+00 | loss scale: 32768.0 | grad norm: 184291.975 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1516/ 159576 | consumed samples: 24256 | elapsed time per iteration (ms): 13601.8 | learning rate: 6.723E-06 | global batch size: 16 | lm loss: 7.364224E+00 | loss scale: 32768.0 | grad norm: 153076.917 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1517/ 159576 | consumed samples: 24272 | elapsed time per iteration (ms): 13603.6 | learning rate: 6.728E-06 | global batch size: 16 | lm loss: 6.912699E+00 | loss scale: 32768.0 | grad norm: 218098.104 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1518/ 159576 | consumed samples: 24288 | elapsed time per iteration (ms): 13640.7 | learning rate: 6.732E-06 | global batch size: 16 | lm loss: 7.323909E+00 | loss scale: 32768.0 | grad norm: 216972.778 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1519/ 159576 | consumed samples: 24304 | elapsed time per iteration (ms): 14045.8 | learning rate: 6.737E-06 | global batch size: 16 | lm loss: 7.068207E+00 | loss scale: 32768.0 | grad norm: 118810.539 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1520/ 159576 | consumed samples: 24320 | elapsed time per iteration (ms): 13595.0 | learning rate: 6.741E-06 | global batch size: 16 | lm loss: 7.160398E+00 | loss scale: 32768.0 | grad norm: 174748.456 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1521/ 159576 | consumed samples: 24336 | elapsed time per iteration (ms): 13611.5 | learning rate: 6.746E-06 | global batch size: 16 | lm loss: 7.170628E+00 | loss scale: 32768.0 | grad norm: 146800.781 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1522/ 159576 | consumed samples: 24352 | elapsed time per iteration (ms): 13576.3 | learning rate: 6.750E-06 | global batch size: 16 | lm loss: 7.141685E+00 | loss scale: 32768.0 | grad norm: 301970.136 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1523/ 159576 | consumed samples: 24368 | elapsed time per iteration (ms): 13818.0 | learning rate: 6.754E-06 | global batch size: 16 | lm loss: 7.351134E+00 | loss scale: 32768.0 | grad norm: 203560.816 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1524/ 159576 | consumed samples: 24384 | elapsed time per iteration (ms): 13700.8 | learning rate: 6.759E-06 | global batch size: 16 | lm loss: 7.291396E+00 | loss scale: 32768.0 | grad norm: 186296.459 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1525/ 159576 | consumed samples: 24400 | elapsed time per iteration (ms): 13611.8 | learning rate: 6.763E-06 | global batch size: 16 | lm loss: 7.052688E+00 | loss scale: 32768.0 | grad norm: 186235.227 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1526/ 159576 | consumed samples: 24416 | elapsed time per iteration (ms): 13626.5 | learning rate: 6.768E-06 | global batch size: 16 | lm loss: 7.083735E+00 | loss scale: 32768.0 | grad norm: 254298.754 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1527/ 159576 | consumed samples: 24432 | elapsed time per iteration (ms): 13677.9 | learning rate: 6.772E-06 | global batch size: 16 | lm loss: 7.212967E+00 | loss scale: 32768.0 | grad norm: 290009.050 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1528/ 159576 | consumed samples: 24448 | elapsed time per iteration (ms): 13998.5 | learning rate: 6.777E-06 | global batch size: 16 | lm loss: 7.249606E+00 | loss scale: 32768.0 | grad norm: 193082.466 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1529/ 159576 | consumed samples: 24464 | elapsed time per iteration (ms): 13543.2 | learning rate: 6.781E-06 | global batch size: 16 | lm loss: 7.187498E+00 | loss scale: 32768.0 | grad norm: 161368.154 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1530/ 159576 | consumed samples: 24480 | elapsed time per iteration (ms): 13565.1 | learning rate: 6.786E-06 | global batch size: 16 | lm loss: 7.266234E+00 | loss scale: 32768.0 | grad norm: 198639.321 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1531/ 159576 | consumed samples: 24496 | elapsed time per iteration (ms): 13571.4 | learning rate: 6.790E-06 | global batch size: 16 | lm loss: 7.528541E+00 | loss scale: 32768.0 | grad norm: 545404.395 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1532/ 159576 | consumed samples: 24512 | elapsed time per iteration (ms): 13970.0 | learning rate: 6.794E-06 | global batch size: 16 | lm loss: 7.212701E+00 | loss scale: 32768.0 | grad norm: 227881.927 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1533/ 159576 | consumed samples: 24528 | elapsed time per iteration (ms): 13566.3 | learning rate: 6.799E-06 | global batch size: 16 | lm loss: 7.440462E+00 | loss scale: 32768.0 | grad norm: 170454.067 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1534/ 159576 | consumed samples: 24544 | elapsed time per iteration (ms): 13611.2 | learning rate: 6.803E-06 | global batch size: 16 | lm loss: 7.264073E+00 | loss scale: 32768.0 | grad norm: 306199.566 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1535/ 159576 | consumed samples: 24560 | elapsed time per iteration (ms): 13661.5 | learning rate: 6.808E-06 | global batch size: 16 | lm loss: 7.109380E+00 | loss scale: 32768.0 | grad norm: 130108.699 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1536/ 159576 | consumed samples: 24576 | elapsed time per iteration (ms): 13539.1 | learning rate: 6.812E-06 | global batch size: 16 | lm loss: 7.475006E+00 | loss scale: 32768.0 | grad norm: 447958.462 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1537/ 159576 | consumed samples: 24592 | elapsed time per iteration (ms): 13698.1 | learning rate: 6.817E-06 | global batch size: 16 | lm loss: 7.372583E+00 | loss scale: 32768.0 | grad norm: 233240.316 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1538/ 159576 | consumed samples: 24608 | elapsed time per iteration (ms): 13601.5 | learning rate: 6.821E-06 | global batch size: 16 | lm loss: 7.208574E+00 | loss scale: 32768.0 | grad norm: 208866.404 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1539/ 159576 | consumed samples: 24624 | elapsed time per iteration (ms): 13645.6 | learning rate: 6.825E-06 | global batch size: 16 | lm loss: 7.209548E+00 | loss scale: 32768.0 | grad norm: 290418.296 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1540/ 159576 | consumed samples: 24640 | elapsed time per iteration (ms): 13628.1 | learning rate: 6.830E-06 | global batch size: 16 | lm loss: 7.168006E+00 | loss scale: 32768.0 | grad norm: 271187.490 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1541/ 159576 | consumed samples: 24656 | elapsed time per iteration (ms): 14103.2 | learning rate: 6.834E-06 | global batch size: 16 | lm loss: 7.235812E+00 | loss scale: 32768.0 | grad norm: 368637.293 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1542/ 159576 | consumed samples: 24672 | elapsed time per iteration (ms): 13752.7 | learning rate: 6.839E-06 | global batch size: 16 | lm loss: 7.205466E+00 | loss scale: 32768.0 | grad norm: 275606.149 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1543/ 159576 | consumed samples: 24688 | elapsed time per iteration (ms): 13526.0 | learning rate: 6.843E-06 | global batch size: 16 | lm loss: 7.152663E+00 | loss scale: 32768.0 | grad norm: 186385.977 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1544/ 159576 | consumed samples: 24704 | elapsed time per iteration (ms): 13591.1 | learning rate: 6.848E-06 | global batch size: 16 | lm loss: 7.402153E+00 | loss scale: 32768.0 | grad norm: 202784.884 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1545/ 159576 | consumed samples: 24720 | elapsed time per iteration (ms): 13853.8 | learning rate: 6.852E-06 | global batch size: 16 | lm loss: 7.254861E+00 | loss scale: 32768.0 | grad norm: 302847.689 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1546/ 159576 | consumed samples: 24736 | elapsed time per iteration (ms): 13718.3 | learning rate: 6.857E-06 | global batch size: 16 | lm loss: 7.259928E+00 | loss scale: 32768.0 | grad norm: 190927.131 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1547/ 159576 | consumed samples: 24752 | elapsed time per iteration (ms): 13565.0 | learning rate: 6.861E-06 | global batch size: 16 | lm loss: 7.226044E+00 | loss scale: 32768.0 | grad norm: 147732.617 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1548/ 159576 | consumed samples: 24768 | elapsed time per iteration (ms): 13562.3 | learning rate: 6.865E-06 | global batch size: 16 | lm loss: 7.106945E+00 | loss scale: 32768.0 | grad norm: 275364.195 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1549/ 159576 | consumed samples: 24784 | elapsed time per iteration (ms): 13573.3 | learning rate: 6.870E-06 | global batch size: 16 | lm loss: 7.157021E+00 | loss scale: 32768.0 | grad norm: 180244.172 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1550/ 159576 | consumed samples: 24800 | elapsed time per iteration (ms): 13916.8 | learning rate: 6.874E-06 | global batch size: 16 | lm loss: 7.001479E+00 | loss scale: 32768.0 | grad norm: 268566.065 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1551/ 159576 | consumed samples: 24816 | elapsed time per iteration (ms): 13651.8 | learning rate: 6.879E-06 | global batch size: 16 | lm loss: 7.167608E+00 | loss scale: 32768.0 | grad norm: 198735.053 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1552/ 159576 | consumed samples: 24832 | elapsed time per iteration (ms): 13608.0 | learning rate: 6.883E-06 | global batch size: 16 | lm loss: 7.093953E+00 | loss scale: 32768.0 | grad norm: 170933.719 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1553/ 159576 | consumed samples: 24848 | elapsed time per iteration (ms): 13517.6 | learning rate: 6.888E-06 | global batch size: 16 | lm loss: 7.234317E+00 | loss scale: 32768.0 | grad norm: 237231.760 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1554/ 159576 | consumed samples: 24864 | elapsed time per iteration (ms): 14011.1 | learning rate: 6.892E-06 | global batch size: 16 | lm loss: 7.130560E+00 | loss scale: 32768.0 | grad norm: 237902.373 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1555/ 159576 | consumed samples: 24880 | elapsed time per iteration (ms): 13510.9 | learning rate: 6.896E-06 | global batch size: 16 | lm loss: 7.275712E+00 | loss scale: 32768.0 | grad norm: 149656.891 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1556/ 159576 | consumed samples: 24896 | elapsed time per iteration (ms): 13617.0 | learning rate: 6.901E-06 | global batch size: 16 | lm loss: 7.239087E+00 | loss scale: 32768.0 | grad norm: 186987.381 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1557/ 159576 | consumed samples: 24912 | elapsed time per iteration (ms): 13622.7 | learning rate: 6.905E-06 | global batch size: 16 | lm loss: 6.972548E+00 | loss scale: 32768.0 | grad norm: 167404.940 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1558/ 159576 | consumed samples: 24928 | elapsed time per iteration (ms): 13629.7 | learning rate: 6.910E-06 | global batch size: 16 | lm loss: 7.274665E+00 | loss scale: 32768.0 | grad norm: 170409.995 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1559/ 159576 | consumed samples: 24944 | elapsed time per iteration (ms): 13856.8 | learning rate: 6.914E-06 | global batch size: 16 | lm loss: 7.320499E+00 | loss scale: 32768.0 | grad norm: 139509.403 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1560/ 159576 | consumed samples: 24960 | elapsed time per iteration (ms): 13572.0 | learning rate: 6.919E-06 | global batch size: 16 | lm loss: 7.481147E+00 | loss scale: 32768.0 | grad norm: 204961.182 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1561/ 159576 | consumed samples: 24976 | elapsed time per iteration (ms): 13609.9 | learning rate: 6.923E-06 | global batch size: 16 | lm loss: 7.318799E+00 | loss scale: 32768.0 | grad norm: 233741.215 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1562/ 159576 | consumed samples: 24992 | elapsed time per iteration (ms): 13593.5 | learning rate: 6.928E-06 | global batch size: 16 | lm loss: 6.970228E+00 | loss scale: 32768.0 | grad norm: 159417.196 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1563/ 159576 | consumed samples: 25008 | elapsed time per iteration (ms): 13894.7 | learning rate: 6.932E-06 | global batch size: 16 | lm loss: 7.266310E+00 | loss scale: 32768.0 | grad norm: 154081.846 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1564/ 159576 | consumed samples: 25024 | elapsed time per iteration (ms): 13687.0 | learning rate: 6.936E-06 | global batch size: 16 | lm loss: 7.274476E+00 | loss scale: 32768.0 | grad norm: 258666.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1565/ 159576 | consumed samples: 25040 | elapsed time per iteration (ms): 13663.3 | learning rate: 6.941E-06 | global batch size: 16 | lm loss: 7.125623E+00 | loss scale: 32768.0 | grad norm: 167968.329 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1566/ 159576 | consumed samples: 25056 | elapsed time per iteration (ms): 13604.1 | learning rate: 6.945E-06 | global batch size: 16 | lm loss: 7.210727E+00 | loss scale: 32768.0 | grad norm: 198543.646 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1567/ 159576 | consumed samples: 25072 | elapsed time per iteration (ms): 14015.2 | learning rate: 6.950E-06 | global batch size: 16 | lm loss: 7.245472E+00 | loss scale: 32768.0 | grad norm: 149711.382 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1568/ 159576 | consumed samples: 25088 | elapsed time per iteration (ms): 13524.3 | learning rate: 6.954E-06 | global batch size: 16 | lm loss: 6.959779E+00 | loss scale: 32768.0 | grad norm: 217321.763 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1569/ 159576 | consumed samples: 25104 | elapsed time per iteration (ms): 13601.8 | learning rate: 6.959E-06 | global batch size: 16 | lm loss: 7.177199E+00 | loss scale: 32768.0 | grad norm: 254297.194 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1570/ 159576 | consumed samples: 25120 | elapsed time per iteration (ms): 13589.9 | learning rate: 6.963E-06 | global batch size: 16 | lm loss: 7.113214E+00 | loss scale: 32768.0 | grad norm: 172729.515 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1571/ 159576 | consumed samples: 25136 | elapsed time per iteration (ms): 13658.1 | learning rate: 6.967E-06 | global batch size: 16 | lm loss: 7.054616E+00 | loss scale: 32768.0 | grad norm: 176859.362 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1572/ 159576 | consumed samples: 25152 | elapsed time per iteration (ms): 13798.6 | learning rate: 6.972E-06 | global batch size: 16 | lm loss: 7.111713E+00 | loss scale: 32768.0 | grad norm: 165282.457 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1573/ 159576 | consumed samples: 25168 | elapsed time per iteration (ms): 13684.6 | learning rate: 6.976E-06 | global batch size: 16 | lm loss: 7.324330E+00 | loss scale: 32768.0 | grad norm: 205395.896 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1574/ 159576 | consumed samples: 25184 | elapsed time per iteration (ms): 13612.3 | learning rate: 6.981E-06 | global batch size: 16 | lm loss: 7.139562E+00 | loss scale: 32768.0 | grad norm: 201180.686 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1575/ 159576 | consumed samples: 25200 | elapsed time per iteration (ms): 13567.2 | learning rate: 6.985E-06 | global batch size: 16 | lm loss: 7.063004E+00 | loss scale: 32768.0 | grad norm: 126181.509 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1576/ 159576 | consumed samples: 25216 | elapsed time per iteration (ms): 13982.4 | learning rate: 6.990E-06 | global batch size: 16 | lm loss: 7.030066E+00 | loss scale: 32768.0 | grad norm: 261758.694 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1577/ 159576 | consumed samples: 25232 | elapsed time per iteration (ms): 13552.2 | learning rate: 6.994E-06 | global batch size: 16 | lm loss: 7.129750E+00 | loss scale: 32768.0 | grad norm: 133747.300 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1578/ 159576 | consumed samples: 25248 | elapsed time per iteration (ms): 13576.0 | learning rate: 6.999E-06 | global batch size: 16 | lm loss: 7.478085E+00 | loss scale: 32768.0 | grad norm: 193421.594 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1579/ 159576 | consumed samples: 25264 | elapsed time per iteration (ms): 13627.7 | learning rate: 7.003E-06 | global batch size: 16 | lm loss: 7.062607E+00 | loss scale: 32768.0 | grad norm: 162309.186 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1580/ 159576 | consumed samples: 25280 | elapsed time per iteration (ms): 13870.0 | learning rate: 7.007E-06 | global batch size: 16 | lm loss: 6.734056E+00 | loss scale: 32768.0 | grad norm: 233732.101 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1581/ 159576 | consumed samples: 25296 | elapsed time per iteration (ms): 13680.5 | learning rate: 7.012E-06 | global batch size: 16 | lm loss: 7.360079E+00 | loss scale: 32768.0 | grad norm: 189405.056 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1582/ 159576 | consumed samples: 25312 | elapsed time per iteration (ms): 13679.9 | learning rate: 7.016E-06 | global batch size: 16 | lm loss: 7.291443E+00 | loss scale: 32768.0 | grad norm: 159639.849 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1583/ 159576 | consumed samples: 25328 | elapsed time per iteration (ms): 13579.9 | learning rate: 7.021E-06 | global batch size: 16 | lm loss: 7.361541E+00 | loss scale: 32768.0 | grad norm: 178947.980 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1584/ 159576 | consumed samples: 25344 | elapsed time per iteration (ms): 13614.6 | learning rate: 7.025E-06 | global batch size: 16 | lm loss: 7.145397E+00 | loss scale: 32768.0 | grad norm: 198293.827 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1585/ 159576 | consumed samples: 25360 | elapsed time per iteration (ms): 13943.5 | learning rate: 7.030E-06 | global batch size: 16 | lm loss: 7.009763E+00 | loss scale: 32768.0 | grad norm: 172995.962 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1586/ 159576 | consumed samples: 25376 | elapsed time per iteration (ms): 13665.6 | learning rate: 7.034E-06 | global batch size: 16 | lm loss: 7.306109E+00 | loss scale: 32768.0 | grad norm: 193555.142 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1587/ 159576 | consumed samples: 25392 | elapsed time per iteration (ms): 13713.0 | learning rate: 7.038E-06 | global batch size: 16 | lm loss: 7.341703E+00 | loss scale: 32768.0 | grad norm: 240981.196 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1588/ 159576 | consumed samples: 25408 | elapsed time per iteration (ms): 13685.0 | learning rate: 7.043E-06 | global batch size: 16 | lm loss: 7.076401E+00 | loss scale: 32768.0 | grad norm: 144170.844 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1589/ 159576 | consumed samples: 25424 | elapsed time per iteration (ms): 13990.2 | learning rate: 7.047E-06 | global batch size: 16 | lm loss: 7.016201E+00 | loss scale: 32768.0 | grad norm: 215101.083 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1590/ 159576 | consumed samples: 25440 | elapsed time per iteration (ms): 13615.2 | learning rate: 7.052E-06 | global batch size: 16 | lm loss: 7.248097E+00 | loss scale: 32768.0 | grad norm: 183674.866 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1591/ 159576 | consumed samples: 25456 | elapsed time per iteration (ms): 13603.7 | learning rate: 7.056E-06 | global batch size: 16 | lm loss: 7.274388E+00 | loss scale: 32768.0 | grad norm: 194912.772 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1592/ 159576 | consumed samples: 25472 | elapsed time per iteration (ms): 13589.1 | learning rate: 7.061E-06 | global batch size: 16 | lm loss: 7.189001E+00 | loss scale: 32768.0 | grad norm: 178991.312 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1593/ 159576 | consumed samples: 25488 | elapsed time per iteration (ms): 13610.8 | learning rate: 7.065E-06 | global batch size: 16 | lm loss: 7.232603E+00 | loss scale: 32768.0 | grad norm: 152962.889 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1594/ 159576 | consumed samples: 25504 | elapsed time per iteration (ms): 13768.0 | learning rate: 7.070E-06 | global batch size: 16 | lm loss: 7.102697E+00 | loss scale: 32768.0 | grad norm: 144835.907 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1595/ 159576 | consumed samples: 25520 | elapsed time per iteration (ms): 13616.0 | learning rate: 7.074E-06 | global batch size: 16 | lm loss: 7.124231E+00 | loss scale: 32768.0 | grad norm: 492597.129 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1596/ 159576 | consumed samples: 25536 | elapsed time per iteration (ms): 13671.0 | learning rate: 7.078E-06 | global batch size: 16 | lm loss: 7.347673E+00 | loss scale: 32768.0 | grad norm: 283986.803 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1597/ 159576 | consumed samples: 25552 | elapsed time per iteration (ms): 13618.5 | learning rate: 7.083E-06 | global batch size: 16 | lm loss: 7.247316E+00 | loss scale: 32768.0 | grad norm: 185319.173 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1598/ 159576 | consumed samples: 25568 | elapsed time per iteration (ms): 14074.4 | learning rate: 7.087E-06 | global batch size: 16 | lm loss: 7.152137E+00 | loss scale: 32768.0 | grad norm: 179820.746 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1599/ 159576 | consumed samples: 25584 | elapsed time per iteration (ms): 13609.5 | learning rate: 7.092E-06 | global batch size: 16 | lm loss: 7.087896E+00 | loss scale: 32768.0 | grad norm: 178653.073 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1600/ 159576 | consumed samples: 25600 | elapsed time per iteration (ms): 13606.5 | learning rate: 7.096E-06 | global batch size: 16 | lm loss: 7.094335E+00 | loss scale: 32768.0 | grad norm: 197442.311 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1601/ 159576 | consumed samples: 25616 | elapsed time per iteration (ms): 13605.3 | learning rate: 7.101E-06 | global batch size: 16 | lm loss: 7.230387E+00 | loss scale: 32768.0 | grad norm: 277453.177 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1602/ 159576 | consumed samples: 25632 | elapsed time per iteration (ms): 14026.8 | learning rate: 7.105E-06 | global batch size: 16 | lm loss: 7.399794E+00 | loss scale: 32768.0 | grad norm: 202190.175 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1603/ 159576 | consumed samples: 25648 | elapsed time per iteration (ms): 13782.5 | learning rate: 7.109E-06 | global batch size: 16 | lm loss: 7.261839E+00 | loss scale: 32768.0 | grad norm: 162395.296 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1604/ 159576 | consumed samples: 25664 | elapsed time per iteration (ms): 13652.4 | learning rate: 7.114E-06 | global batch size: 16 | lm loss: 7.202652E+00 | loss scale: 32768.0 | grad norm: 199798.347 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1605/ 159576 | consumed samples: 25680 | elapsed time per iteration (ms): 13537.9 | learning rate: 7.118E-06 | global batch size: 16 | lm loss: 7.002069E+00 | loss scale: 32768.0 | grad norm: 200932.321 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1606/ 159576 | consumed samples: 25696 | elapsed time per iteration (ms): 13623.9 | learning rate: 7.123E-06 | global batch size: 16 | lm loss: 6.994870E+00 | loss scale: 32768.0 | grad norm: 182105.456 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1607/ 159576 | consumed samples: 25712 | elapsed time per iteration (ms): 13778.9 | learning rate: 7.127E-06 | global batch size: 16 | lm loss: 7.236290E+00 | loss scale: 32768.0 | grad norm: 210525.575 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1608/ 159576 | consumed samples: 25728 | elapsed time per iteration (ms): 13614.0 | learning rate: 7.132E-06 | global batch size: 16 | lm loss: 7.271640E+00 | loss scale: 32768.0 | grad norm: 155104.364 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1609/ 159576 | consumed samples: 25744 | elapsed time per iteration (ms): 13637.4 | learning rate: 7.136E-06 | global batch size: 16 | lm loss: 7.142178E+00 | loss scale: 32768.0 | grad norm: 179013.826 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1610/ 159576 | consumed samples: 25760 | elapsed time per iteration (ms): 13663.2 | learning rate: 7.141E-06 | global batch size: 16 | lm loss: 7.233703E+00 | loss scale: 32768.0 | grad norm: 205415.974 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1611/ 159576 | consumed samples: 25776 | elapsed time per iteration (ms): 14078.6 | learning rate: 7.145E-06 | global batch size: 16 | lm loss: 7.137359E+00 | loss scale: 32768.0 | grad norm: 211115.165 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1612/ 159576 | consumed samples: 25792 | elapsed time per iteration (ms): 13476.7 | learning rate: 7.149E-06 | global batch size: 16 | lm loss: 7.265315E+00 | loss scale: 32768.0 | grad norm: 221323.191 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1613/ 159576 | consumed samples: 25808 | elapsed time per iteration (ms): 13601.4 | learning rate: 7.154E-06 | global batch size: 16 | lm loss: 7.092045E+00 | loss scale: 32768.0 | grad norm: 157009.908 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1614/ 159576 | consumed samples: 25824 | elapsed time per iteration (ms): 13616.6 | learning rate: 7.158E-06 | global batch size: 16 | lm loss: 7.018819E+00 | loss scale: 32768.0 | grad norm: 198533.340 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1615/ 159576 | consumed samples: 25840 | elapsed time per iteration (ms): 13623.7 | learning rate: 7.163E-06 | global batch size: 16 | lm loss: 7.280205E+00 | loss scale: 32768.0 | grad norm: 288417.013 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1616/ 159576 | consumed samples: 25856 | elapsed time per iteration (ms): 13877.9 | learning rate: 7.167E-06 | global batch size: 16 | lm loss: 7.224732E+00 | loss scale: 32768.0 | grad norm: 186062.210 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1617/ 159576 | consumed samples: 25872 | elapsed time per iteration (ms): 13663.6 | learning rate: 7.172E-06 | global batch size: 16 | lm loss: 7.238441E+00 | loss scale: 32768.0 | grad norm: 168294.596 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1618/ 159576 | consumed samples: 25888 | elapsed time per iteration (ms): 13675.4 | learning rate: 7.176E-06 | global batch size: 16 | lm loss: 7.159503E+00 | loss scale: 32768.0 | grad norm: 181012.249 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1619/ 159576 | consumed samples: 25904 | elapsed time per iteration (ms): 13559.3 | learning rate: 7.180E-06 | global batch size: 16 | lm loss: 7.125117E+00 | loss scale: 32768.0 | grad norm: 156261.868 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1620/ 159576 | consumed samples: 25920 | elapsed time per iteration (ms): 14141.4 | learning rate: 7.185E-06 | global batch size: 16 | lm loss: 7.312489E+00 | loss scale: 32768.0 | grad norm: 501804.049 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1621/ 159576 | consumed samples: 25936 | elapsed time per iteration (ms): 13619.8 | learning rate: 7.189E-06 | global batch size: 16 | lm loss: 7.144738E+00 | loss scale: 32768.0 | grad norm: 187512.417 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1622/ 159576 | consumed samples: 25952 | elapsed time per iteration (ms): 13623.1 | learning rate: 7.194E-06 | global batch size: 16 | lm loss: 7.036147E+00 | loss scale: 32768.0 | grad norm: 185668.156 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1623/ 159576 | consumed samples: 25968 | elapsed time per iteration (ms): 13626.1 | learning rate: 7.198E-06 | global batch size: 16 | lm loss: 6.981637E+00 | loss scale: 32768.0 | grad norm: 194478.314 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1624/ 159576 | consumed samples: 25984 | elapsed time per iteration (ms): 13916.5 | learning rate: 7.203E-06 | global batch size: 16 | lm loss: 7.098595E+00 | loss scale: 32768.0 | grad norm: 176876.504 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1625/ 159576 | consumed samples: 26000 | elapsed time per iteration (ms): 13897.1 | learning rate: 7.207E-06 | global batch size: 16 | lm loss: 7.024785E+00 | loss scale: 32768.0 | grad norm: 133422.500 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1626/ 159576 | consumed samples: 26016 | elapsed time per iteration (ms): 13553.3 | learning rate: 7.212E-06 | global batch size: 16 | lm loss: 7.101878E+00 | loss scale: 32768.0 | grad norm: 187471.535 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1627/ 159576 | consumed samples: 26032 | elapsed time per iteration (ms): 13608.6 | learning rate: 7.216E-06 | global batch size: 16 | lm loss: 7.083658E+00 | loss scale: 32768.0 | grad norm: 163022.597 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1628/ 159576 | consumed samples: 26048 | elapsed time per iteration (ms): 13598.7 | learning rate: 7.220E-06 | global batch size: 16 | lm loss: 7.128680E+00 | loss scale: 32768.0 | grad norm: 227341.519 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1629/ 159576 | consumed samples: 26064 | elapsed time per iteration (ms): 13737.0 | learning rate: 7.225E-06 | global batch size: 16 | lm loss: 7.226182E+00 | loss scale: 32768.0 | grad norm: 173557.190 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1630/ 159576 | consumed samples: 26080 | elapsed time per iteration (ms): 13598.4 | learning rate: 7.229E-06 | global batch size: 16 | lm loss: 7.204190E+00 | loss scale: 32768.0 | grad norm: 194336.283 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1631/ 159576 | consumed samples: 26096 | elapsed time per iteration (ms): 13618.5 | learning rate: 7.234E-06 | global batch size: 16 | lm loss: 7.295867E+00 | loss scale: 32768.0 | grad norm: 218111.651 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1632/ 159576 | consumed samples: 26112 | elapsed time per iteration (ms): 13608.1 | learning rate: 7.238E-06 | global batch size: 16 | lm loss: 7.313629E+00 | loss scale: 32768.0 | grad norm: 150755.205 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1633/ 159576 | consumed samples: 26128 | elapsed time per iteration (ms): 13926.3 | learning rate: 7.243E-06 | global batch size: 16 | lm loss: 7.105534E+00 | loss scale: 32768.0 | grad norm: 416417.348 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1634/ 159576 | consumed samples: 26144 | elapsed time per iteration (ms): 13573.4 | learning rate: 7.247E-06 | global batch size: 16 | lm loss: 7.154237E+00 | loss scale: 32768.0 | grad norm: 222886.895 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1635/ 159576 | consumed samples: 26160 | elapsed time per iteration (ms): 13613.9 | learning rate: 7.251E-06 | global batch size: 16 | lm loss: 7.367383E+00 | loss scale: 32768.0 | grad norm: 198928.120 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1636/ 159576 | consumed samples: 26176 | elapsed time per iteration (ms): 13620.0 | learning rate: 7.256E-06 | global batch size: 16 | lm loss: 7.224826E+00 | loss scale: 32768.0 | grad norm: 190490.724 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1637/ 159576 | consumed samples: 26192 | elapsed time per iteration (ms): 13847.4 | learning rate: 7.260E-06 | global batch size: 16 | lm loss: 7.133263E+00 | loss scale: 32768.0 | grad norm: 335044.490 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1638/ 159576 | consumed samples: 26208 | elapsed time per iteration (ms): 13680.4 | learning rate: 7.265E-06 | global batch size: 16 | lm loss: 6.991650E+00 | loss scale: 32768.0 | grad norm: 351935.284 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1639/ 159576 | consumed samples: 26224 | elapsed time per iteration (ms): 13603.3 | learning rate: 7.269E-06 | global batch size: 16 | lm loss: 7.261710E+00 | loss scale: 32768.0 | grad norm: 162679.611 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1640/ 159576 | consumed samples: 26240 | elapsed time per iteration (ms): 13643.0 | learning rate: 7.274E-06 | global batch size: 16 | lm loss: 7.243075E+00 | loss scale: 32768.0 | grad norm: 139259.853 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1641/ 159576 | consumed samples: 26256 | elapsed time per iteration (ms): 13685.4 | learning rate: 7.278E-06 | global batch size: 16 | lm loss: 7.347486E+00 | loss scale: 32768.0 | grad norm: 190145.472 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1642/ 159576 | consumed samples: 26272 | elapsed time per iteration (ms): 13709.0 | learning rate: 7.283E-06 | global batch size: 16 | lm loss: 7.168586E+00 | loss scale: 32768.0 | grad norm: 250612.086 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1643/ 159576 | consumed samples: 26288 | elapsed time per iteration (ms): 13686.3 | learning rate: 7.287E-06 | global batch size: 16 | lm loss: 7.042645E+00 | loss scale: 32768.0 | grad norm: 181688.669 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1644/ 159576 | consumed samples: 26304 | elapsed time per iteration (ms): 13617.6 | learning rate: 7.291E-06 | global batch size: 16 | lm loss: 6.992811E+00 | loss scale: 32768.0 | grad norm: 173387.997 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1645/ 159576 | consumed samples: 26320 | elapsed time per iteration (ms): 13588.3 | learning rate: 7.296E-06 | global batch size: 16 | lm loss: 6.948548E+00 | loss scale: 32768.0 | grad norm: 204171.623 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1646/ 159576 | consumed samples: 26336 | elapsed time per iteration (ms): 13943.8 | learning rate: 7.300E-06 | global batch size: 16 | lm loss: 7.227940E+00 | loss scale: 32768.0 | grad norm: 249546.841 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1647/ 159576 | consumed samples: 26352 | elapsed time per iteration (ms): 13526.7 | learning rate: 7.305E-06 | global batch size: 16 | lm loss: 7.150325E+00 | loss scale: 32768.0 | grad norm: 187163.297 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1648/ 159576 | consumed samples: 26368 | elapsed time per iteration (ms): 13689.1 | learning rate: 7.309E-06 | global batch size: 16 | lm loss: 7.017026E+00 | loss scale: 32768.0 | grad norm: 155331.100 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1649/ 159576 | consumed samples: 26384 | elapsed time per iteration (ms): 13592.0 | learning rate: 7.314E-06 | global batch size: 16 | lm loss: 6.946849E+00 | loss scale: 32768.0 | grad norm: 224463.632 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1650/ 159576 | consumed samples: 26400 | elapsed time per iteration (ms): 13576.3 | learning rate: 7.318E-06 | global batch size: 16 | lm loss: 7.179192E+00 | loss scale: 32768.0 | grad norm: 276611.361 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1651/ 159576 | consumed samples: 26416 | elapsed time per iteration (ms): 13958.1 | learning rate: 7.322E-06 | global batch size: 16 | lm loss: 7.176366E+00 | loss scale: 32768.0 | grad norm: 180366.507 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1652/ 159576 | consumed samples: 26432 | elapsed time per iteration (ms): 13632.4 | learning rate: 7.327E-06 | global batch size: 16 | lm loss: 7.206745E+00 | loss scale: 32768.0 | grad norm: 135845.317 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1653/ 159576 | consumed samples: 26448 | elapsed time per iteration (ms): 13613.1 | learning rate: 7.331E-06 | global batch size: 16 | lm loss: 7.259154E+00 | loss scale: 32768.0 | grad norm: 403068.502 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1654/ 159576 | consumed samples: 26464 | elapsed time per iteration (ms): 13593.5 | learning rate: 7.336E-06 | global batch size: 16 | lm loss: 7.201679E+00 | loss scale: 32768.0 | grad norm: 362463.795 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1655/ 159576 | consumed samples: 26480 | elapsed time per iteration (ms): 14016.8 | learning rate: 7.340E-06 | global batch size: 16 | lm loss: 7.291797E+00 | loss scale: 32768.0 | grad norm: 167369.816 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1656/ 159576 | consumed samples: 26496 | elapsed time per iteration (ms): 13699.1 | learning rate: 7.345E-06 | global batch size: 16 | lm loss: 7.091952E+00 | loss scale: 32768.0 | grad norm: 165135.009 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1657/ 159576 | consumed samples: 26512 | elapsed time per iteration (ms): 13569.2 | learning rate: 7.349E-06 | global batch size: 16 | lm loss: 7.068718E+00 | loss scale: 32768.0 | grad norm: 202181.410 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1658/ 159576 | consumed samples: 26528 | elapsed time per iteration (ms): 13577.2 | learning rate: 7.354E-06 | global batch size: 16 | lm loss: 7.233033E+00 | loss scale: 32768.0 | grad norm: 333361.854 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1659/ 159576 | consumed samples: 26544 | elapsed time per iteration (ms): 13970.5 | learning rate: 7.358E-06 | global batch size: 16 | lm loss: 7.330973E+00 | loss scale: 32768.0 | grad norm: 164401.480 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1660/ 159576 | consumed samples: 26560 | elapsed time per iteration (ms): 13585.6 | learning rate: 7.362E-06 | global batch size: 16 | lm loss: 7.127686E+00 | loss scale: 32768.0 | grad norm: 165830.496 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1661/ 159576 | consumed samples: 26576 | elapsed time per iteration (ms): 13601.7 | learning rate: 7.367E-06 | global batch size: 16 | lm loss: 7.202850E+00 | loss scale: 32768.0 | grad norm: 214035.250 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1662/ 159576 | consumed samples: 26592 | elapsed time per iteration (ms): 13596.7 | learning rate: 7.371E-06 | global batch size: 16 | lm loss: 7.194968E+00 | loss scale: 32768.0 | grad norm: 269427.808 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1663/ 159576 | consumed samples: 26608 | elapsed time per iteration (ms): 13626.2 | learning rate: 7.376E-06 | global batch size: 16 | lm loss: 7.079875E+00 | loss scale: 32768.0 | grad norm: 243204.527 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1664/ 159576 | consumed samples: 26624 | elapsed time per iteration (ms): 13820.6 | learning rate: 7.380E-06 | global batch size: 16 | lm loss: 7.253979E+00 | loss scale: 32768.0 | grad norm: 184892.216 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 1665/ 159576 | consumed samples: 26640 | elapsed time per iteration (ms): 13606.7 | learning rate: 7.385E-06 | global batch size: 16 | lm loss: 7.021820E+00 | loss scale: 32768.0 | grad norm: 220398.877 | num zeros: 0.0 | number of skipped iterations: 0
| number of nan iterations: 0 | -time (ms) - iteration 1666/ 159576 | consumed samples: 26656 | elapsed time per iteration (ms): 13594.3 | learning rate: 7.389E-06 | global batch size: 16 | lm loss: 7.115512E+00 | loss scale: 32768.0 | grad norm: 307682.966 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1667/ 159576 | consumed samples: 26672 | elapsed time per iteration (ms): 13584.1 | learning rate: 7.393E-06 | global batch size: 16 | lm loss: 7.301219E+00 | loss scale: 32768.0 | grad norm: 326739.461 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1668/ 159576 | consumed samples: 26688 | elapsed time per iteration (ms): 13934.9 | learning rate: 7.398E-06 | global batch size: 16 | lm loss: 7.091152E+00 | loss scale: 32768.0 | grad norm: 179218.130 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1669/ 159576 | consumed samples: 26704 | elapsed time per iteration (ms): 13576.9 | learning rate: 7.402E-06 | global batch size: 16 | lm loss: 7.060991E+00 | loss scale: 32768.0 | grad norm: 212478.902 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1670/ 159576 | consumed samples: 26720 | elapsed time per iteration (ms): 13622.1 | learning rate: 7.407E-06 | global batch size: 16 | lm loss: 7.225494E+00 | loss scale: 32768.0 | grad norm: 312859.396 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1671/ 159576 | consumed samples: 26736 | elapsed time per iteration (ms): 13558.9 | learning rate: 7.411E-06 | global batch size: 16 | lm loss: 6.931543E+00 | loss scale: 32768.0 | grad norm: 214910.265 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1672/ 159576 | consumed samples: 26752 | elapsed time per iteration (ms): 13593.0 | learning rate: 7.416E-06 | global batch size: 16 | lm loss: 7.111391E+00 | loss scale: 32768.0 | grad norm: 167374.362 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1673/ 159576 | consumed samples: 26768 | elapsed time per iteration (ms): 14083.5 | learning rate: 7.420E-06 | global batch size: 16 | lm loss: 7.119873E+00 | loss scale: 32768.0 | grad norm: 207656.393 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1674/ 159576 | consumed samples: 26784 | elapsed time per iteration (ms): 13580.7 | learning rate: 7.425E-06 | global batch size: 16 | lm loss: 7.190612E+00 | loss scale: 32768.0 | grad norm: 138716.556 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1675/ 159576 | consumed samples: 26800 | elapsed time per iteration (ms): 13560.5 | learning rate: 7.429E-06 | global batch size: 16 | lm loss: 7.118540E+00 | loss scale: 32768.0 | grad norm: 288523.946 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1676/ 159576 | consumed samples: 26816 | elapsed time per iteration (ms): 13591.4 | learning rate: 7.433E-06 | global batch size: 16 | lm loss: 7.228687E+00 | loss scale: 32768.0 | grad norm: 184651.956 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1677/ 159576 | consumed samples: 26832 | elapsed time per iteration (ms): 14019.3 | learning rate: 
7.438E-06 | global batch size: 16 | lm loss: 7.062222E+00 | loss scale: 32768.0 | grad norm: 166988.550 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1678/ 159576 | consumed samples: 26848 | elapsed time per iteration (ms): 13663.4 | learning rate: 7.442E-06 | global batch size: 16 | lm loss: 7.206205E+00 | loss scale: 32768.0 | grad norm: 760966.811 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1679/ 159576 | consumed samples: 26864 | elapsed time per iteration (ms): 13583.3 | learning rate: 7.447E-06 | global batch size: 16 | lm loss: 7.183750E+00 | loss scale: 32768.0 | grad norm: 619056.103 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1680/ 159576 | consumed samples: 26880 | elapsed time per iteration (ms): 13598.8 | learning rate: 7.451E-06 | global batch size: 16 | lm loss: 7.188565E+00 | loss scale: 32768.0 | grad norm: 363445.728 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1681/ 159576 | consumed samples: 26896 | elapsed time per iteration (ms): 14083.3 | learning rate: 7.456E-06 | global batch size: 16 | lm loss: 7.135269E+00 | loss scale: 32768.0 | grad norm: 201434.725 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1682/ 159576 | consumed samples: 26912 | elapsed time per iteration (ms): 13432.4 | learning rate: 7.460E-06 | global batch size: 16 | lm loss: 7.080773E+00 | loss scale: 32768.0 | grad norm: 223123.023 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1683/ 159576 | consumed samples: 26928 | elapsed time per iteration (ms): 13629.9 | learning rate: 7.464E-06 | global batch size: 16 | lm loss: 7.018581E+00 | loss scale: 32768.0 | grad norm: 160716.882 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1684/ 159576 | consumed samples: 26944 | elapsed time per iteration (ms): 13543.1 | learning rate: 7.469E-06 | global batch size: 16 | lm loss: 7.045646E+00 | loss scale: 32768.0 | grad norm: 319366.517 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1685/ 159576 | consumed samples: 26960 | elapsed time per iteration (ms): 13556.0 | learning rate: 7.473E-06 | global batch size: 16 | lm loss: 7.139486E+00 | loss scale: 32768.0 | grad norm: 154250.022 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1686/ 159576 | consumed samples: 26976 | elapsed time per iteration (ms): 13875.3 | learning rate: 7.478E-06 | global batch size: 16 | lm loss: 7.146173E+00 | loss scale: 32768.0 | grad norm: 186495.170 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1687/ 159576 | consumed samples: 26992 | elapsed time per iteration (ms): 13583.8 | learning rate: 7.482E-06 | global batch size: 16 | lm loss: 7.207047E+00 | loss scale: 32768.0 | grad norm: 129574.140 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1688/ 159576 | consumed samples: 27008 | elapsed time per iteration (ms): 13590.1 | learning rate: 7.487E-06 | global batch size: 16 | lm loss: 7.150177E+00 | loss scale: 32768.0 | grad norm: 310199.485 | num zeros: 0.0 | number of skipped iterations: 0 
| number of nan iterations: 0 | -time (ms) - iteration 1689/ 159576 | consumed samples: 27024 | elapsed time per iteration (ms): 13636.7 | learning rate: 7.491E-06 | global batch size: 16 | lm loss: 7.136959E+00 | loss scale: 32768.0 | grad norm: 142456.264 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1690/ 159576 | consumed samples: 27040 | elapsed time per iteration (ms): 13898.3 | learning rate: 7.496E-06 | global batch size: 16 | lm loss: 6.991103E+00 | loss scale: 32768.0 | grad norm: 206942.247 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1691/ 159576 | consumed samples: 27056 | elapsed time per iteration (ms): 13637.0 | learning rate: 7.500E-06 | global batch size: 16 | lm loss: 7.147140E+00 | loss scale: 32768.0 | grad norm: 297164.074 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1692/ 159576 | consumed samples: 27072 | elapsed time per iteration (ms): 13592.2 | learning rate: 7.504E-06 | global batch size: 16 | lm loss: 7.166695E+00 | loss scale: 32768.0 | grad norm: 174829.948 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1693/ 159576 | consumed samples: 27088 | elapsed time per iteration (ms): 13634.0 | learning rate: 7.509E-06 | global batch size: 16 | lm loss: 7.124074E+00 | loss scale: 32768.0 | grad norm: 356202.604 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1694/ 159576 | consumed samples: 27104 | elapsed time per iteration (ms): 13929.9 | learning rate: 7.513E-06 | global batch size: 16 | lm loss: 7.219958E+00 | loss scale: 32768.0 | grad norm: 288764.199 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1695/ 159576 | consumed samples: 27120 | elapsed time per iteration (ms): 13812.8 | learning rate: 7.518E-06 | global batch size: 16 | lm loss: 7.030488E+00 | loss scale: 32768.0 | grad norm: 164638.861 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1696/ 159576 | consumed samples: 27136 | elapsed time per iteration (ms): 13601.5 | learning rate: 7.522E-06 | global batch size: 16 | lm loss: 7.288185E+00 | loss scale: 32768.0 | grad norm: 241747.916 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1697/ 159576 | consumed samples: 27152 | elapsed time per iteration (ms): 13619.0 | learning rate: 7.527E-06 | global batch size: 16 | lm loss: 7.110942E+00 | loss scale: 32768.0 | grad norm: 183251.862 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1698/ 159576 | consumed samples: 27168 | elapsed time per iteration (ms): 13580.4 | learning rate: 7.531E-06 | global batch size: 16 | lm loss: 7.096193E+00 | loss scale: 32768.0 | grad norm: 187930.778 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1699/ 159576 | consumed samples: 27184 | elapsed time per iteration (ms): 14055.7 | learning rate: 7.536E-06 | global batch size: 16 | lm loss: 6.976962E+00 | loss scale: 32768.0 | grad norm: 186599.931 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1700/ 159576 | consumed samples: 27200 | elapsed time per iteration (ms): 13642.0 | learning rate: 
7.540E-06 | global batch size: 16 | lm loss: 6.916706E+00 | loss scale: 32768.0 | grad norm: 212948.424 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1701/ 159576 | consumed samples: 27216 | elapsed time per iteration (ms): 13615.0 | learning rate: 7.544E-06 | global batch size: 16 | lm loss: 7.194331E+00 | loss scale: 32768.0 | grad norm: 144812.346 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1702/ 159576 | consumed samples: 27232 | elapsed time per iteration (ms): 13551.3 | learning rate: 7.549E-06 | global batch size: 16 | lm loss: 7.139325E+00 | loss scale: 32768.0 | grad norm: 331590.334 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1703/ 159576 | consumed samples: 27248 | elapsed time per iteration (ms): 13973.8 | learning rate: 7.553E-06 | global batch size: 16 | lm loss: 7.042914E+00 | loss scale: 32768.0 | grad norm: 195366.856 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1704/ 159576 | consumed samples: 27264 | elapsed time per iteration (ms): 13614.8 | learning rate: 7.558E-06 | global batch size: 16 | lm loss: 7.087082E+00 | loss scale: 32768.0 | grad norm: 217381.135 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1705/ 159576 | consumed samples: 27280 | elapsed time per iteration (ms): 13611.2 | learning rate: 7.562E-06 | global batch size: 16 | lm loss: 7.013979E+00 | loss scale: 32768.0 | grad norm: 198091.797 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1706/ 159576 | consumed samples: 27296 | elapsed time per iteration (ms): 13574.3 | learning rate: 7.567E-06 | global batch size: 16 | lm loss: 7.016004E+00 | loss scale: 32768.0 | grad norm: 222098.009 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1707/ 159576 | consumed samples: 27312 | elapsed time per iteration (ms): 13629.3 | learning rate: 7.571E-06 | global batch size: 16 | lm loss: 7.175000E+00 | loss scale: 32768.0 | grad norm: 409215.441 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1708/ 159576 | consumed samples: 27328 | elapsed time per iteration (ms): 13904.2 | learning rate: 7.575E-06 | global batch size: 16 | lm loss: 7.071371E+00 | loss scale: 32768.0 | grad norm: 273410.975 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1709/ 159576 | consumed samples: 27344 | elapsed time per iteration (ms): 13558.1 | learning rate: 7.580E-06 | global batch size: 16 | lm loss: 7.002718E+00 | loss scale: 32768.0 | grad norm: 197884.964 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1710/ 159576 | consumed samples: 27360 | elapsed time per iteration (ms): 13639.3 | learning rate: 7.584E-06 | global batch size: 16 | lm loss: 7.323861E+00 | loss scale: 32768.0 | grad norm: 172073.111 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1711/ 159576 | consumed samples: 27376 | elapsed time per iteration (ms): 13631.6 | learning rate: 7.589E-06 | global batch size: 16 | lm loss: 6.922392E+00 | loss scale: 32768.0 | grad norm: 326721.457 | num zeros: 0.0 | number of skipped iterations: 0 
| number of nan iterations: 0 | -time (ms) - iteration 1712/ 159576 | consumed samples: 27392 | elapsed time per iteration (ms): 13982.8 | learning rate: 7.593E-06 | global batch size: 16 | lm loss: 7.148055E+00 | loss scale: 32768.0 | grad norm: 280337.172 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1713/ 159576 | consumed samples: 27408 | elapsed time per iteration (ms): 13635.8 | learning rate: 7.598E-06 | global batch size: 16 | lm loss: 7.088178E+00 | loss scale: 32768.0 | grad norm: 200762.506 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1714/ 159576 | consumed samples: 27424 | elapsed time per iteration (ms): 13581.9 | learning rate: 7.602E-06 | global batch size: 16 | lm loss: 7.096650E+00 | loss scale: 32768.0 | grad norm: 204299.283 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1715/ 159576 | consumed samples: 27440 | elapsed time per iteration (ms): 13647.6 | learning rate: 7.607E-06 | global batch size: 16 | lm loss: 6.916616E+00 | loss scale: 32768.0 | grad norm: 127407.249 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1716/ 159576 | consumed samples: 27456 | elapsed time per iteration (ms): 13904.0 | learning rate: 7.611E-06 | global batch size: 16 | lm loss: 7.066643E+00 | loss scale: 32768.0 | grad norm: 371440.502 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1717/ 159576 | consumed samples: 27472 | elapsed time per iteration (ms): 13717.4 | learning rate: 7.615E-06 | global batch size: 16 | lm loss: 7.332389E+00 | loss scale: 32768.0 | grad norm: 403592.093 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1718/ 159576 | consumed samples: 27488 | elapsed time per iteration (ms): 13591.7 | learning rate: 7.620E-06 | global batch size: 16 | lm loss: 7.055027E+00 | loss scale: 32768.0 | grad norm: 200151.647 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1719/ 159576 | consumed samples: 27504 | elapsed time per iteration (ms): 13560.8 | learning rate: 7.624E-06 | global batch size: 16 | lm loss: 7.176567E+00 | loss scale: 32768.0 | grad norm: 144423.577 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1720/ 159576 | consumed samples: 27520 | elapsed time per iteration (ms): 13600.7 | learning rate: 7.629E-06 | global batch size: 16 | lm loss: 6.984463E+00 | loss scale: 32768.0 | grad norm: 303766.844 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1721/ 159576 | consumed samples: 27536 | elapsed time per iteration (ms): 13892.8 | learning rate: 7.633E-06 | global batch size: 16 | lm loss: 6.990324E+00 | loss scale: 32768.0 | grad norm: 154861.936 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1722/ 159576 | consumed samples: 27552 | elapsed time per iteration (ms): 13527.0 | learning rate: 7.638E-06 | global batch size: 16 | lm loss: 7.238751E+00 | loss scale: 32768.0 | grad norm: 231731.625 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1723/ 159576 | consumed samples: 27568 | elapsed time per iteration (ms): 13536.8 | learning rate: 
7.642E-06 | global batch size: 16 | lm loss: 7.130395E+00 | loss scale: 32768.0 | grad norm: 190824.462 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1724/ 159576 | consumed samples: 27584 | elapsed time per iteration (ms): 13580.6 | learning rate: 7.646E-06 | global batch size: 16 | lm loss: 7.182058E+00 | loss scale: 32768.0 | grad norm: 266208.840 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1725/ 159576 | consumed samples: 27600 | elapsed time per iteration (ms): 13961.0 | learning rate: 7.651E-06 | global batch size: 16 | lm loss: 7.108085E+00 | loss scale: 32768.0 | grad norm: 284420.360 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1726/ 159576 | consumed samples: 27616 | elapsed time per iteration (ms): 13537.5 | learning rate: 7.655E-06 | global batch size: 16 | lm loss: 7.049166E+00 | loss scale: 32768.0 | grad norm: 189929.247 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1727/ 159576 | consumed samples: 27632 | elapsed time per iteration (ms): 13583.4 | learning rate: 7.660E-06 | global batch size: 16 | lm loss: 7.012967E+00 | loss scale: 32768.0 | grad norm: 174720.301 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1728/ 159576 | consumed samples: 27648 | elapsed time per iteration (ms): 13605.5 | learning rate: 7.664E-06 | global batch size: 16 | lm loss: 7.237570E+00 | loss scale: 32768.0 | grad norm: 194798.770 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1729/ 159576 | consumed samples: 27664 | elapsed time per iteration (ms): 13552.5 | learning rate: 7.669E-06 | global batch size: 16 | lm loss: 7.138112E+00 | loss scale: 32768.0 | grad norm: 289252.424 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1730/ 159576 | consumed samples: 27680 | elapsed time per iteration (ms): 14055.9 | learning rate: 7.673E-06 | global batch size: 16 | lm loss: 7.041800E+00 | loss scale: 32768.0 | grad norm: 190020.342 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1731/ 159576 | consumed samples: 27696 | elapsed time per iteration (ms): 13571.4 | learning rate: 7.678E-06 | global batch size: 16 | lm loss: 7.037878E+00 | loss scale: 32768.0 | grad norm: 149538.464 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1732/ 159576 | consumed samples: 27712 | elapsed time per iteration (ms): 13585.4 | learning rate: 7.682E-06 | global batch size: 16 | lm loss: 7.179647E+00 | loss scale: 32768.0 | grad norm: 151351.062 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1733/ 159576 | consumed samples: 27728 | elapsed time per iteration (ms): 13582.2 | learning rate: 7.686E-06 | global batch size: 16 | lm loss: 7.234662E+00 | loss scale: 32768.0 | grad norm: 317716.715 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1734/ 159576 | consumed samples: 27744 | elapsed time per iteration (ms): 14148.8 | learning rate: 7.691E-06 | global batch size: 16 | lm loss: 7.306998E+00 | loss scale: 32768.0 | grad norm: 216190.319 | num zeros: 0.0 | number of skipped iterations: 0 
| number of nan iterations: 0 | -time (ms) - iteration 1735/ 159576 | consumed samples: 27760 | elapsed time per iteration (ms): 13664.2 | learning rate: 7.695E-06 | global batch size: 16 | lm loss: 7.130812E+00 | loss scale: 32768.0 | grad norm: 168041.258 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1736/ 159576 | consumed samples: 27776 | elapsed time per iteration (ms): 13539.2 | learning rate: 7.700E-06 | global batch size: 16 | lm loss: 7.164721E+00 | loss scale: 32768.0 | grad norm: 189764.472 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1737/ 159576 | consumed samples: 27792 | elapsed time per iteration (ms): 13580.1 | learning rate: 7.704E-06 | global batch size: 16 | lm loss: 7.213598E+00 | loss scale: 32768.0 | grad norm: 231432.124 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1738/ 159576 | consumed samples: 27808 | elapsed time per iteration (ms): 13874.0 | learning rate: 7.709E-06 | global batch size: 16 | lm loss: 7.064263E+00 | loss scale: 32768.0 | grad norm: 332299.668 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1739/ 159576 | consumed samples: 27824 | elapsed time per iteration (ms): 13542.8 | learning rate: 7.713E-06 | global batch size: 16 | lm loss: 7.187717E+00 | loss scale: 32768.0 | grad norm: 159503.470 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1740/ 159576 | consumed samples: 27840 | elapsed time per iteration (ms): 13564.1 | learning rate: 7.717E-06 | global batch size: 16 | lm loss: 7.212025E+00 | loss scale: 32768.0 | grad norm: 275497.658 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1741/ 159576 | consumed samples: 27856 | elapsed time per iteration (ms): 13584.8 | learning rate: 7.722E-06 | global batch size: 16 | lm loss: 6.960712E+00 | loss scale: 32768.0 | grad norm: 307419.828 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1742/ 159576 | consumed samples: 27872 | elapsed time per iteration (ms): 13621.1 | learning rate: 7.726E-06 | global batch size: 16 | lm loss: 7.086576E+00 | loss scale: 32768.0 | grad norm: 156758.997 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1743/ 159576 | consumed samples: 27888 | elapsed time per iteration (ms): 13719.9 | learning rate: 7.731E-06 | global batch size: 16 | lm loss: 6.961288E+00 | loss scale: 32768.0 | grad norm: 147761.212 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1744/ 159576 | consumed samples: 27904 | elapsed time per iteration (ms): 13570.6 | learning rate: 7.735E-06 | global batch size: 16 | lm loss: 7.320576E+00 | loss scale: 32768.0 | grad norm: 309786.612 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1745/ 159576 | consumed samples: 27920 | elapsed time per iteration (ms): 13600.3 | learning rate: 7.740E-06 | global batch size: 16 | lm loss: 7.218632E+00 | loss scale: 32768.0 | grad norm: 330698.583 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1746/ 159576 | consumed samples: 27936 | elapsed time per iteration (ms): 13548.3 | learning rate: 
7.744E-06 | global batch size: 16 | lm loss: 7.139973E+00 | loss scale: 32768.0 | grad norm: 376967.322 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1747/ 159576 | consumed samples: 27952 | elapsed time per iteration (ms): 13954.3 | learning rate: 7.749E-06 | global batch size: 16 | lm loss: 7.074110E+00 | loss scale: 32768.0 | grad norm: 214147.428 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1748/ 159576 | consumed samples: 27968 | elapsed time per iteration (ms): 13621.8 | learning rate: 7.753E-06 | global batch size: 16 | lm loss: 7.254288E+00 | loss scale: 32768.0 | grad norm: 128937.522 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1749/ 159576 | consumed samples: 27984 | elapsed time per iteration (ms): 13626.6 | learning rate: 7.757E-06 | global batch size: 16 | lm loss: 7.009082E+00 | loss scale: 32768.0 | grad norm: 392446.478 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1750/ 159576 | consumed samples: 28000 | elapsed time per iteration (ms): 13590.6 | learning rate: 7.762E-06 | global batch size: 16 | lm loss: 6.949193E+00 | loss scale: 32768.0 | grad norm: 205911.332 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1751/ 159576 | consumed samples: 28016 | elapsed time per iteration (ms): 13916.9 | learning rate: 7.766E-06 | global batch size: 16 | lm loss: 7.175614E+00 | loss scale: 32768.0 | grad norm: 181359.266 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1752/ 159576 | consumed samples: 28032 | elapsed time per iteration (ms): 13747.5 | learning rate: 7.771E-06 | global batch size: 16 | lm loss: 7.084972E+00 | loss scale: 32768.0 | grad norm: 191810.333 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1753/ 159576 | consumed samples: 28048 | elapsed time per iteration (ms): 13591.1 | learning rate: 7.775E-06 | global batch size: 16 | lm loss: 7.125815E+00 | loss scale: 32768.0 | grad norm: 150833.632 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1754/ 159576 | consumed samples: 28064 | elapsed time per iteration (ms): 13552.4 | learning rate: 7.780E-06 | global batch size: 16 | lm loss: 7.096021E+00 | loss scale: 32768.0 | grad norm: 858159.626 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1755/ 159576 | consumed samples: 28080 | elapsed time per iteration (ms): 13586.8 | learning rate: 7.784E-06 | global batch size: 16 | lm loss: 7.401230E+00 | loss scale: 32768.0 | grad norm: 1015122.062 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1756/ 159576 | consumed samples: 28096 | elapsed time per iteration (ms): 14062.7 | learning rate: 7.788E-06 | global batch size: 16 | lm loss: 7.141807E+00 | loss scale: 32768.0 | grad norm: 241473.375 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1757/ 159576 | consumed samples: 28112 | elapsed time per iteration (ms): 13654.9 | learning rate: 7.793E-06 | global batch size: 16 | lm loss: 7.055682E+00 | loss scale: 32768.0 | grad norm: 195258.121 | num zeros: 0.0 | number of skipped iterations: 0 
| number of nan iterations: 0 | -time (ms) - iteration 1758/ 159576 | consumed samples: 28128 | elapsed time per iteration (ms): 13576.6 | learning rate: 7.797E-06 | global batch size: 16 | lm loss: 6.887124E+00 | loss scale: 32768.0 | grad norm: 209948.309 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1759/ 159576 | consumed samples: 28144 | elapsed time per iteration (ms): 13615.8 | learning rate: 7.802E-06 | global batch size: 16 | lm loss: 7.008955E+00 | loss scale: 32768.0 | grad norm: 218109.807 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1760/ 159576 | consumed samples: 28160 | elapsed time per iteration (ms): 13880.5 | learning rate: 7.806E-06 | global batch size: 16 | lm loss: 7.156555E+00 | loss scale: 32768.0 | grad norm: 199049.119 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1761/ 159576 | consumed samples: 28176 | elapsed time per iteration (ms): 13559.3 | learning rate: 7.811E-06 | global batch size: 16 | lm loss: 7.445184E+00 | loss scale: 32768.0 | grad norm: 571721.433 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1762/ 159576 | consumed samples: 28192 | elapsed time per iteration (ms): 13597.9 | learning rate: 7.815E-06 | global batch size: 16 | lm loss: 7.408930E+00 | loss scale: 32768.0 | grad norm: 477324.031 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1763/ 159576 | consumed samples: 28208 | elapsed time per iteration (ms): 13646.1 | learning rate: 7.820E-06 | global batch size: 16 | lm loss: 7.228862E+00 | loss scale: 32768.0 | grad norm: 183806.995 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1764/ 159576 | consumed samples: 28224 | elapsed time per iteration (ms): 13595.0 | learning rate: 7.824E-06 | global batch size: 16 | lm loss: 7.213759E+00 | loss scale: 32768.0 | grad norm: 199120.863 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1765/ 159576 | consumed samples: 28240 | elapsed time per iteration (ms): 13787.5 | learning rate: 7.828E-06 | global batch size: 16 | lm loss: 7.190694E+00 | loss scale: 32768.0 | grad norm: 230903.246 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1766/ 159576 | consumed samples: 28256 | elapsed time per iteration (ms): 13655.5 | learning rate: 7.833E-06 | global batch size: 16 | lm loss: 7.120300E+00 | loss scale: 32768.0 | grad norm: 189933.714 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1767/ 159576 | consumed samples: 28272 | elapsed time per iteration (ms): 13622.2 | learning rate: 7.837E-06 | global batch size: 16 | lm loss: 7.216377E+00 | loss scale: 32768.0 | grad norm: 411494.652 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1768/ 159576 | consumed samples: 28288 | elapsed time per iteration (ms): 13542.4 | learning rate: 7.842E-06 | global batch size: 16 | lm loss: 7.296084E+00 | loss scale: 32768.0 | grad norm: 211545.292 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1769/ 159576 | consumed samples: 28304 | elapsed time per iteration (ms): 13835.8 | learning rate: 
7.846E-06 | global batch size: 16 | lm loss: 7.014185E+00 | loss scale: 32768.0 | grad norm: 159627.354 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1770/ 159576 | consumed samples: 28320 | elapsed time per iteration (ms): 13625.8 | learning rate: 7.851E-06 | global batch size: 16 | lm loss: 7.260840E+00 | loss scale: 32768.0 | grad norm: 230760.798 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -[2021-09-24 09:03:50] PULSE: tr8-104B is waiting for the previous job to finish before scheduling a new one using the dependency mechanism (1165978_[1-10%1] on 'gpu_p13' partition) -[2021-09-24 09:03:50] PULSE: tr8-104B is running for 3:11:39 since 2021-09-24T05:52:11 (1162855_1 on 'gpu_p13' partition (r6i4n[5,7],r6i5n[2,7-8],r6i6n[0,2,6],r7i2n[4-5],r7i6n[2-4],r7i7n[7-8],r8i0n[2-3,5-8],r8i1n[0,2-4],r8i2n8,r8i3n[0-2],r8i5n[3-4],r8i7n[3-8],r9i0n[0-2],r9i1n[0-3],r9i2n[3-5,8],r9i3n[0-1,7-8],r9i4n[0-2],r9i5n[3-8],r9i6n[0,7-8]) - iteration 1771/ 159576 | consumed samples: 28336 | elapsed time per iteration (ms): 13609.6 | learning rate: 7.855E-06 | global batch size: 16 | lm loss: 7.096549E+00 | loss scale: 32768.0 | grad norm: 208126.291 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1772/ 159576 | consumed samples: 28352 | elapsed time per iteration (ms): 13612.5 | learning rate: 7.859E-06 | global batch size: 16 | lm loss: 7.288601E+00 | loss scale: 32768.0 | grad norm: 299861.795 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1773/ 159576 | consumed samples: 28368 | elapsed time per iteration (ms): 14036.1 | learning rate: 7.864E-06 | global batch size: 16 | lm loss: 7.006525E+00 | loss scale: 32768.0 | grad norm: 221185.737 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1774/ 159576 | consumed samples: 28384 | elapsed time per iteration (ms): 13455.1 | learning rate: 7.868E-06 | global batch size: 16 | lm loss: 7.057816E+00 | loss scale: 32768.0 | grad norm: 211669.427 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1775/ 159576 | consumed samples: 28400 | elapsed time per iteration (ms): 13580.5 | learning rate: 7.873E-06 | global batch size: 16 | lm loss: 7.225205E+00 | loss scale: 32768.0 | grad norm: 232985.961 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1776/ 159576 | consumed samples: 28416 | elapsed time per iteration (ms): 13577.7 | learning rate: 7.877E-06 | global batch size: 16 | lm loss: 7.090505E+00 | loss scale: 32768.0 | grad norm: 148862.985 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1777/ 159576 | consumed samples: 28432 | elapsed time per iteration (ms): 13633.9 | learning rate: 7.882E-06 | global batch size: 16 | lm loss: 7.291343E+00 | loss scale: 32768.0 | grad norm: 241931.207 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1778/ 159576 | consumed samples: 28448 | elapsed time per iteration (ms): 13810.9 | learning rate: 7.886E-06 | global batch size: 16 | lm loss: 7.168088E+00 | loss scale: 32768.0 | grad norm: 186155.211 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1779/ 159576 | consumed samples: 
28464 | elapsed time per iteration (ms): 13677.6 | learning rate: 7.891E-06 | global batch size: 16 | lm loss: 6.975587E+00 | loss scale: 32768.0 | grad norm: 141385.386 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1780/ 159576 | consumed samples: 28480 | elapsed time per iteration (ms): 13699.5 | learning rate: 7.895E-06 | global batch size: 16 | lm loss: 7.234455E+00 | loss scale: 32768.0 | grad norm: 167275.043 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1781/ 159576 | consumed samples: 28496 | elapsed time per iteration (ms): 13560.1 | learning rate: 7.899E-06 | global batch size: 16 | lm loss: 7.118816E+00 | loss scale: 32768.0 | grad norm: 185745.557 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1782/ 159576 | consumed samples: 28512 | elapsed time per iteration (ms): 14007.0 | learning rate: 7.904E-06 | global batch size: 16 | lm loss: 7.325441E+00 | loss scale: 32768.0 | grad norm: 151237.535 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1783/ 159576 | consumed samples: 28528 | elapsed time per iteration (ms): 13468.4 | learning rate: 7.908E-06 | global batch size: 16 | lm loss: 6.976577E+00 | loss scale: 32768.0 | grad norm: 157950.458 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1784/ 159576 | consumed samples: 28544 | elapsed time per iteration (ms): 13610.8 | learning rate: 7.913E-06 | global batch size: 16 | lm loss: 7.151215E+00 | loss scale: 32768.0 | grad norm: 185745.960 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1785/ 159576 | consumed samples: 28560 | elapsed time per iteration (ms): 13574.9 | learning rate: 7.917E-06 | global batch size: 16 | lm loss: 6.982706E+00 | loss scale: 32768.0 | grad norm: 212394.757 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1786/ 159576 | consumed samples: 28576 | elapsed time per iteration (ms): 13593.1 | learning rate: 7.922E-06 | global batch size: 16 | lm loss: 7.090255E+00 | loss scale: 32768.0 | grad norm: 165476.788 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1787/ 159576 | consumed samples: 28592 | elapsed time per iteration (ms): 13825.7 | learning rate: 7.926E-06 | global batch size: 16 | lm loss: 7.190539E+00 | loss scale: 32768.0 | grad norm: 105058.438 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1788/ 159576 | consumed samples: 28608 | elapsed time per iteration (ms): 13613.9 | learning rate: 7.930E-06 | global batch size: 16 | lm loss: 6.849520E+00 | loss scale: 32768.0 | grad norm: 180790.521 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1789/ 159576 | consumed samples: 28624 | elapsed time per iteration (ms): 13633.8 | learning rate: 7.935E-06 | global batch size: 16 | lm loss: 7.203046E+00 | loss scale: 32768.0 | grad norm: 126112.335 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1790/ 159576 | consumed samples: 28640 | elapsed time per iteration (ms): 13618.2 | learning rate: 7.939E-06 | global batch size: 16 | lm loss: 7.073618E+00 | loss scale: 32768.0 | grad 
norm: 138120.801 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1791/ 159576 | consumed samples: 28656 | elapsed time per iteration (ms): 14044.8 | learning rate: 7.944E-06 | global batch size: 16 | lm loss: 7.193256E+00 | loss scale: 32768.0 | grad norm: 127392.206 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1792/ 159576 | consumed samples: 28672 | elapsed time per iteration (ms): 13675.9 | learning rate: 7.948E-06 | global batch size: 16 | lm loss: 7.182660E+00 | loss scale: 32768.0 | grad norm: 128828.190 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1793/ 159576 | consumed samples: 28688 | elapsed time per iteration (ms): 13639.0 | learning rate: 7.953E-06 | global batch size: 16 | lm loss: 7.029709E+00 | loss scale: 32768.0 | grad norm: 123453.201 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1794/ 159576 | consumed samples: 28704 | elapsed time per iteration (ms): 13728.8 | learning rate: 7.957E-06 | global batch size: 16 | lm loss: 7.166730E+00 | loss scale: 32768.0 | grad norm: 117050.511 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1795/ 159576 | consumed samples: 28720 | elapsed time per iteration (ms): 13951.0 | learning rate: 7.962E-06 | global batch size: 16 | lm loss: 7.100776E+00 | loss scale: 32768.0 | grad norm: 166379.571 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1796/ 159576 | consumed samples: 28736 | elapsed time per iteration (ms): 13626.1 | learning rate: 7.966E-06 | global batch size: 16 | lm loss: 7.059687E+00 | loss scale: 32768.0 | grad norm: 165877.869 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1797/ 159576 | consumed samples: 28752 | elapsed time per iteration (ms): 13658.2 | learning rate: 7.970E-06 | global batch size: 16 | lm loss: 7.128800E+00 | loss scale: 32768.0 | grad norm: 241870.659 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1798/ 159576 | consumed samples: 28768 | elapsed time per iteration (ms): 13547.6 | learning rate: 7.975E-06 | global batch size: 16 | lm loss: 6.884446E+00 | loss scale: 32768.0 | grad norm: 129845.941 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1799/ 159576 | consumed samples: 28784 | elapsed time per iteration (ms): 13614.6 | learning rate: 7.979E-06 | global batch size: 16 | lm loss: 7.309677E+00 | loss scale: 32768.0 | grad norm: 156206.470 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1800/ 159576 | consumed samples: 28800 | elapsed time per iteration (ms): 13719.1 | learning rate: 7.984E-06 | global batch size: 16 | lm loss: 6.891129E+00 | loss scale: 32768.0 | grad norm: 130612.475 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1801/ 159576 | consumed samples: 28816 | elapsed time per iteration (ms): 13709.3 | learning rate: 7.988E-06 | global batch size: 16 | lm loss: 7.259354E+00 | loss scale: 32768.0 | grad norm: 299631.068 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1802/ 159576 | consumed samples: 
28832 | elapsed time per iteration (ms): 13702.3 | learning rate: 7.993E-06 | global batch size: 16 | lm loss: 7.091782E+00 | loss scale: 32768.0 | grad norm: 164547.713 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1803/ 159576 | consumed samples: 28848 | elapsed time per iteration (ms): 13667.9 | learning rate: 7.997E-06 | global batch size: 16 | lm loss: 7.081347E+00 | loss scale: 32768.0 | grad norm: 157884.119 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1804/ 159576 | consumed samples: 28864 | elapsed time per iteration (ms): 14087.7 | learning rate: 8.001E-06 | global batch size: 16 | lm loss: 7.043708E+00 | loss scale: 32768.0 | grad norm: 179047.535 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1805/ 159576 | consumed samples: 28880 | elapsed time per iteration (ms): 13636.0 | learning rate: 8.006E-06 | global batch size: 16 | lm loss: 7.153672E+00 | loss scale: 32768.0 | grad norm: 171473.191 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1806/ 159576 | consumed samples: 28896 | elapsed time per iteration (ms): 13563.1 | learning rate: 8.010E-06 | global batch size: 16 | lm loss: 7.067021E+00 | loss scale: 32768.0 | grad norm: 114434.243 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1807/ 159576 | consumed samples: 28912 | elapsed time per iteration (ms): 13653.6 | learning rate: 8.015E-06 | global batch size: 16 | lm loss: 7.234491E+00 | loss scale: 32768.0 | grad norm: 149275.670 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1808/ 159576 | consumed samples: 28928 | elapsed time per iteration (ms): 13997.0 | learning rate: 8.019E-06 | global batch size: 16 | lm loss: 7.015783E+00 | loss scale: 32768.0 | grad norm: 179254.375 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1809/ 159576 | consumed samples: 28944 | elapsed time per iteration (ms): 13813.5 | learning rate: 8.024E-06 | global batch size: 16 | lm loss: 7.176732E+00 | loss scale: 32768.0 | grad norm: 180477.986 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1810/ 159576 | consumed samples: 28960 | elapsed time per iteration (ms): 13672.4 | learning rate: 8.028E-06 | global batch size: 16 | lm loss: 6.590204E+00 | loss scale: 32768.0 | grad norm: 149127.876 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1811/ 159576 | consumed samples: 28976 | elapsed time per iteration (ms): 13741.3 | learning rate: 8.033E-06 | global batch size: 16 | lm loss: 7.100949E+00 | loss scale: 32768.0 | grad norm: 133004.506 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1812/ 159576 | consumed samples: 28992 | elapsed time per iteration (ms): 13598.0 | learning rate: 8.037E-06 | global batch size: 16 | lm loss: 7.268322E+00 | loss scale: 32768.0 | grad norm: 287887.492 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1813/ 159576 | consumed samples: 29008 | elapsed time per iteration (ms): 13826.0 | learning rate: 8.041E-06 | global batch size: 16 | lm loss: 7.048282E+00 | loss scale: 32768.0 | grad 
norm: 147045.336 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1814/ 159576 | consumed samples: 29024 | elapsed time per iteration (ms): 13651.5 | learning rate: 8.046E-06 | global batch size: 16 | lm loss: 7.168237E+00 | loss scale: 32768.0 | grad norm: 167345.880 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1815/ 159576 | consumed samples: 29040 | elapsed time per iteration (ms): 13646.2 | learning rate: 8.050E-06 | global batch size: 16 | lm loss: 6.976926E+00 | loss scale: 32768.0 | grad norm: 173193.629 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1816/ 159576 | consumed samples: 29056 | elapsed time per iteration (ms): 13708.4 | learning rate: 8.055E-06 | global batch size: 16 | lm loss: 7.173286E+00 | loss scale: 32768.0 | grad norm: 156812.836 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1817/ 159576 | consumed samples: 29072 | elapsed time per iteration (ms): 14056.6 | learning rate: 8.059E-06 | global batch size: 16 | lm loss: 7.191895E+00 | loss scale: 32768.0 | grad norm: 254989.804 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1818/ 159576 | consumed samples: 29088 | elapsed time per iteration (ms): 13727.1 | learning rate: 8.064E-06 | global batch size: 16 | lm loss: 7.070405E+00 | loss scale: 32768.0 | grad norm: 128138.350 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1819/ 159576 | consumed samples: 29104 | elapsed time per iteration (ms): 13606.2 | learning rate: 8.068E-06 | global batch size: 16 | lm loss: 6.955974E+00 | loss scale: 32768.0 | grad norm: 140247.528 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1820/ 159576 | consumed samples: 29120 | elapsed time per iteration (ms): 13652.5 | learning rate: 8.072E-06 | global batch size: 16 | lm loss: 7.029711E+00 | loss scale: 32768.0 | grad norm: 153040.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1821/ 159576 | consumed samples: 29136 | elapsed time per iteration (ms): 13671.5 | learning rate: 8.077E-06 | global batch size: 16 | lm loss: 7.097312E+00 | loss scale: 32768.0 | grad norm: 168364.904 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1822/ 159576 | consumed samples: 29152 | elapsed time per iteration (ms): 13964.1 | learning rate: 8.081E-06 | global batch size: 16 | lm loss: 7.163728E+00 | loss scale: 32768.0 | grad norm: 143592.573 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1823/ 159576 | consumed samples: 29168 | elapsed time per iteration (ms): 13677.5 | learning rate: 8.086E-06 | global batch size: 16 | lm loss: 7.161910E+00 | loss scale: 32768.0 | grad norm: 232336.600 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1824/ 159576 | consumed samples: 29184 | elapsed time per iteration (ms): 13682.4 | learning rate: 8.090E-06 | global batch size: 16 | lm loss: 7.241871E+00 | loss scale: 32768.0 | grad norm: 136988.706 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1825/ 159576 | consumed samples: 
29200 | elapsed time per iteration (ms): 13681.2 | learning rate: 8.095E-06 | global batch size: 16 | lm loss: 6.885506E+00 | loss scale: 32768.0 | grad norm: 147212.456 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1826/ 159576 | consumed samples: 29216 | elapsed time per iteration (ms): 14107.7 | learning rate: 8.099E-06 | global batch size: 16 | lm loss: 7.094235E+00 | loss scale: 32768.0 | grad norm: 210358.599 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1827/ 159576 | consumed samples: 29232 | elapsed time per iteration (ms): 13698.2 | learning rate: 8.104E-06 | global batch size: 16 | lm loss: 6.987474E+00 | loss scale: 32768.0 | grad norm: 200444.612 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1828/ 159576 | consumed samples: 29248 | elapsed time per iteration (ms): 13646.3 | learning rate: 8.108E-06 | global batch size: 16 | lm loss: 7.024292E+00 | loss scale: 32768.0 | grad norm: 144708.093 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1829/ 159576 | consumed samples: 29264 | elapsed time per iteration (ms): 13672.0 | learning rate: 8.112E-06 | global batch size: 16 | lm loss: 7.101940E+00 | loss scale: 32768.0 | grad norm: 137983.288 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1830/ 159576 | consumed samples: 29280 | elapsed time per iteration (ms): 13973.1 | learning rate: 8.117E-06 | global batch size: 16 | lm loss: 6.950300E+00 | loss scale: 32768.0 | grad norm: 228570.073 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1831/ 159576 | consumed samples: 29296 | elapsed time per iteration (ms): 13712.1 | learning rate: 8.121E-06 | global batch size: 16 | lm loss: 7.000825E+00 | loss scale: 32768.0 | grad norm: 204009.839 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1832/ 159576 | consumed samples: 29312 | elapsed time per iteration (ms): 13734.6 | learning rate: 8.126E-06 | global batch size: 16 | lm loss: 7.021888E+00 | loss scale: 32768.0 | grad norm: 168698.722 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1833/ 159576 | consumed samples: 29328 | elapsed time per iteration (ms): 13643.1 | learning rate: 8.130E-06 | global batch size: 16 | lm loss: 6.956877E+00 | loss scale: 32768.0 | grad norm: 139702.257 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1834/ 159576 | consumed samples: 29344 | elapsed time per iteration (ms): 13670.0 | learning rate: 8.135E-06 | global batch size: 16 | lm loss: 7.078534E+00 | loss scale: 32768.0 | grad norm: 220188.892 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1835/ 159576 | consumed samples: 29360 | elapsed time per iteration (ms): 13786.5 | learning rate: 8.139E-06 | global batch size: 16 | lm loss: 7.145173E+00 | loss scale: 32768.0 | grad norm: 181620.360 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1836/ 159576 | consumed samples: 29376 | elapsed time per iteration (ms): 13684.7 | learning rate: 8.143E-06 | global batch size: 16 | lm loss: 7.147571E+00 | loss scale: 32768.0 | grad 
norm: 148241.508 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1837/ 159576 | consumed samples: 29392 | elapsed time per iteration (ms): 13650.8 | learning rate: 8.148E-06 | global batch size: 16 | lm loss: 7.198610E+00 | loss scale: 32768.0 | grad norm: 129198.374 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1838/ 159576 | consumed samples: 29408 | elapsed time per iteration (ms): 13689.6 | learning rate: 8.152E-06 | global batch size: 16 | lm loss: 7.077027E+00 | loss scale: 32768.0 | grad norm: 179805.881 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1839/ 159576 | consumed samples: 29424 | elapsed time per iteration (ms): 14193.0 | learning rate: 8.157E-06 | global batch size: 16 | lm loss: 7.034157E+00 | loss scale: 32768.0 | grad norm: 179474.021 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1840/ 159576 | consumed samples: 29440 | elapsed time per iteration (ms): 13593.3 | learning rate: 8.161E-06 | global batch size: 16 | lm loss: 7.132106E+00 | loss scale: 32768.0 | grad norm: 138966.354 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1841/ 159576 | consumed samples: 29456 | elapsed time per iteration (ms): 13717.8 | learning rate: 8.166E-06 | global batch size: 16 | lm loss: 7.290091E+00 | loss scale: 32768.0 | grad norm: 176321.035 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1842/ 159576 | consumed samples: 29472 | elapsed time per iteration (ms): 13672.3 | learning rate: 8.170E-06 | global batch size: 16 | lm loss: 7.222583E+00 | loss scale: 32768.0 | grad norm: 157190.685 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1843/ 159576 | consumed samples: 29488 | elapsed time per iteration (ms): 14041.0 | learning rate: 8.175E-06 | global batch size: 16 | lm loss: 7.080160E+00 | loss scale: 32768.0 | grad norm: 209951.002 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1844/ 159576 | consumed samples: 29504 | elapsed time per iteration (ms): 13687.6 | learning rate: 8.179E-06 | global batch size: 16 | lm loss: 7.044501E+00 | loss scale: 32768.0 | grad norm: 148871.965 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1845/ 159576 | consumed samples: 29520 | elapsed time per iteration (ms): 13645.6 | learning rate: 8.183E-06 | global batch size: 16 | lm loss: 7.157808E+00 | loss scale: 32768.0 | grad norm: 274735.365 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1846/ 159576 | consumed samples: 29536 | elapsed time per iteration (ms): 13730.4 | learning rate: 8.188E-06 | global batch size: 16 | lm loss: 6.885038E+00 | loss scale: 32768.0 | grad norm: 152141.636 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1847/ 159576 | consumed samples: 29552 | elapsed time per iteration (ms): 13619.7 | learning rate: 8.192E-06 | global batch size: 16 | lm loss: 7.235194E+00 | loss scale: 32768.0 | grad norm: 176093.463 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1848/ 159576 | consumed samples: 
29568 | elapsed time per iteration (ms): 13886.2 | learning rate: 8.197E-06 | global batch size: 16 | lm loss: 7.254928E+00 | loss scale: 32768.0 | grad norm: 205754.293 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1849/ 159576 | consumed samples: 29584 | elapsed time per iteration (ms): 13743.9 | learning rate: 8.201E-06 | global batch size: 16 | lm loss: 7.040710E+00 | loss scale: 32768.0 | grad norm: 218799.146 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1850/ 159576 | consumed samples: 29600 | elapsed time per iteration (ms): 13589.2 | learning rate: 8.206E-06 | global batch size: 16 | lm loss: 7.048983E+00 | loss scale: 32768.0 | grad norm: 207680.104 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1851/ 159576 | consumed samples: 29616 | elapsed time per iteration (ms): 13643.5 | learning rate: 8.210E-06 | global batch size: 16 | lm loss: 7.264068E+00 | loss scale: 32768.0 | grad norm: 172145.935 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1852/ 159576 | consumed samples: 29632 | elapsed time per iteration (ms): 14007.8 | learning rate: 8.214E-06 | global batch size: 16 | lm loss: 7.091225E+00 | loss scale: 32768.0 | grad norm: 165885.271 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1853/ 159576 | consumed samples: 29648 | elapsed time per iteration (ms): 13621.7 | learning rate: 8.219E-06 | global batch size: 16 | lm loss: 7.004953E+00 | loss scale: 32768.0 | grad norm: 193763.726 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1854/ 159576 | consumed samples: 29664 | elapsed time per iteration (ms): 13705.7 | learning rate: 8.223E-06 | global batch size: 16 | lm loss: 7.337306E+00 | loss scale: 32768.0 | grad norm: 334165.602 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1855/ 159576 | consumed samples: 29680 | elapsed time per iteration (ms): 13688.7 | learning rate: 8.228E-06 | global batch size: 16 | lm loss: 7.088278E+00 | loss scale: 32768.0 | grad norm: 168305.003 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1856/ 159576 | consumed samples: 29696 | elapsed time per iteration (ms): 14064.4 | learning rate: 8.232E-06 | global batch size: 16 | lm loss: 7.075657E+00 | loss scale: 32768.0 | grad norm: 146104.687 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1857/ 159576 | consumed samples: 29712 | elapsed time per iteration (ms): 13622.8 | learning rate: 8.237E-06 | global batch size: 16 | lm loss: 7.326543E+00 | loss scale: 32768.0 | grad norm: 226986.122 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1858/ 159576 | consumed samples: 29728 | elapsed time per iteration (ms): 13661.1 | learning rate: 8.241E-06 | global batch size: 16 | lm loss: 7.226311E+00 | loss scale: 32768.0 | grad norm: 127252.080 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1859/ 159576 | consumed samples: 29744 | elapsed time per iteration (ms): 13672.4 | learning rate: 8.246E-06 | global batch size: 16 | lm loss: 7.024733E+00 | loss scale: 32768.0 | grad 
norm: 195136.100 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1860/ 159576 | consumed samples: 29760 | elapsed time per iteration (ms): 13685.6 | learning rate: 8.250E-06 | global batch size: 16 | lm loss: 7.050764E+00 | loss scale: 32768.0 | grad norm: 137697.941 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1861/ 159576 | consumed samples: 29776 | elapsed time per iteration (ms): 13956.5 | learning rate: 8.254E-06 | global batch size: 16 | lm loss: 7.164598E+00 | loss scale: 32768.0 | grad norm: 186285.211 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1862/ 159576 | consumed samples: 29792 | elapsed time per iteration (ms): 13801.6 | learning rate: 8.259E-06 | global batch size: 16 | lm loss: 6.982927E+00 | loss scale: 32768.0 | grad norm: 155576.175 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1863/ 159576 | consumed samples: 29808 | elapsed time per iteration (ms): 13779.0 | learning rate: 8.263E-06 | global batch size: 16 | lm loss: 6.845668E+00 | loss scale: 32768.0 | grad norm: 211290.875 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1864/ 159576 | consumed samples: 29824 | elapsed time per iteration (ms): 13629.6 | learning rate: 8.268E-06 | global batch size: 16 | lm loss: 7.561100E+00 | loss scale: 32768.0 | grad norm: 177907.854 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1865/ 159576 | consumed samples: 29840 | elapsed time per iteration (ms): 14024.6 | learning rate: 8.272E-06 | global batch size: 16 | lm loss: 7.056180E+00 | loss scale: 32768.0 | grad norm: 132307.729 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1866/ 159576 | consumed samples: 29856 | elapsed time per iteration (ms): 13629.1 | learning rate: 8.277E-06 | global batch size: 16 | lm loss: 7.005206E+00 | loss scale: 32768.0 | grad norm: 140727.432 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1867/ 159576 | consumed samples: 29872 | elapsed time per iteration (ms): 13680.5 | learning rate: 8.281E-06 | global batch size: 16 | lm loss: 7.008940E+00 | loss scale: 32768.0 | grad norm: 149676.751 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1868/ 159576 | consumed samples: 29888 | elapsed time per iteration (ms): 13661.9 | learning rate: 8.286E-06 | global batch size: 16 | lm loss: 7.154263E+00 | loss scale: 32768.0 | grad norm: 181537.747 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1869/ 159576 | consumed samples: 29904 | elapsed time per iteration (ms): 13705.9 | learning rate: 8.290E-06 | global batch size: 16 | lm loss: 7.144859E+00 | loss scale: 32768.0 | grad norm: 156740.122 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1870/ 159576 | consumed samples: 29920 | elapsed time per iteration (ms): 13994.0 | learning rate: 8.294E-06 | global batch size: 16 | lm loss: 7.053184E+00 | loss scale: 32768.0 | grad norm: 209836.615 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1871/ 159576 | consumed samples: 
29936 | elapsed time per iteration (ms): 13623.9 | learning rate: 8.299E-06 | global batch size: 16 | lm loss: 7.033763E+00 | loss scale: 32768.0 | grad norm: 173327.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1872/ 159576 | consumed samples: 29952 | elapsed time per iteration (ms): 13679.1 | learning rate: 8.303E-06 | global batch size: 16 | lm loss: 6.990786E+00 | loss scale: 32768.0 | grad norm: 281336.242 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1873/ 159576 | consumed samples: 29968 | elapsed time per iteration (ms): 13694.2 | learning rate: 8.308E-06 | global batch size: 16 | lm loss: 7.073781E+00 | loss scale: 32768.0 | grad norm: 124900.526 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1874/ 159576 | consumed samples: 29984 | elapsed time per iteration (ms): 13905.9 | learning rate: 8.312E-06 | global batch size: 16 | lm loss: 7.112270E+00 | loss scale: 32768.0 | grad norm: 168221.159 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1875/ 159576 | consumed samples: 30000 | elapsed time per iteration (ms): 13703.7 | learning rate: 8.317E-06 | global batch size: 16 | lm loss: 7.233196E+00 | loss scale: 32768.0 | grad norm: 174650.162 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1876/ 159576 | consumed samples: 30016 | elapsed time per iteration (ms): 13702.9 | learning rate: 8.321E-06 | global batch size: 16 | lm loss: 6.967190E+00 | loss scale: 32768.0 | grad norm: 177533.380 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1877/ 159576 | consumed samples: 30032 | elapsed time per iteration (ms): 13717.8 | learning rate: 8.325E-06 | global batch size: 16 | lm loss: 7.208225E+00 | loss scale: 32768.0 | grad norm: 207887.332 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1878/ 159576 | consumed samples: 30048 | elapsed time per iteration (ms): 14066.9 | learning rate: 8.330E-06 | global batch size: 16 | lm loss: 7.077339E+00 | loss scale: 32768.0 | grad norm: 142338.907 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1879/ 159576 | consumed samples: 30064 | elapsed time per iteration (ms): 13776.6 | learning rate: 8.334E-06 | global batch size: 16 | lm loss: 7.113251E+00 | loss scale: 32768.0 | grad norm: 158300.777 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1880/ 159576 | consumed samples: 30080 | elapsed time per iteration (ms): 13663.2 | learning rate: 8.339E-06 | global batch size: 16 | lm loss: 6.912469E+00 | loss scale: 32768.0 | grad norm: 145353.873 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1881/ 159576 | consumed samples: 30096 | elapsed time per iteration (ms): 13679.1 | learning rate: 8.343E-06 | global batch size: 16 | lm loss: 7.055939E+00 | loss scale: 32768.0 | grad norm: 337973.880 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1882/ 159576 | consumed samples: 30112 | elapsed time per iteration (ms): 13654.4 | learning rate: 8.348E-06 | global batch size: 16 | lm loss: 6.903512E+00 | loss scale: 32768.0 | grad 
norm: 240165.485 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1883/ 159576 | consumed samples: 30128 | elapsed time per iteration (ms): 13896.8 | learning rate: 8.352E-06 | global batch size: 16 | lm loss: 7.154733E+00 | loss scale: 32768.0 | grad norm: 145006.968 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1884/ 159576 | consumed samples: 30144 | elapsed time per iteration (ms): 13729.5 | learning rate: 8.357E-06 | global batch size: 16 | lm loss: 7.018287E+00 | loss scale: 32768.0 | grad norm: 447058.582 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1885/ 159576 | consumed samples: 30160 | elapsed time per iteration (ms): 13624.7 | learning rate: 8.361E-06 | global batch size: 16 | lm loss: 7.306771E+00 | loss scale: 32768.0 | grad norm: 269279.686 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1886/ 159576 | consumed samples: 30176 | elapsed time per iteration (ms): 13710.2 | learning rate: 8.365E-06 | global batch size: 16 | lm loss: 7.124641E+00 | loss scale: 32768.0 | grad norm: 184189.442 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1887/ 159576 | consumed samples: 30192 | elapsed time per iteration (ms): 14269.7 | learning rate: 8.370E-06 | global batch size: 16 | lm loss: 7.147641E+00 | loss scale: 32768.0 | grad norm: 240777.486 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1888/ 159576 | consumed samples: 30208 | elapsed time per iteration (ms): 13668.8 | learning rate: 8.374E-06 | global batch size: 16 | lm loss: 7.246544E+00 | loss scale: 32768.0 | grad norm: 221768.309 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1889/ 159576 | consumed samples: 30224 | elapsed time per iteration (ms): 13682.0 | learning rate: 8.379E-06 | global batch size: 16 | lm loss: 7.042133E+00 | loss scale: 32768.0 | grad norm: 453492.686 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1890/ 159576 | consumed samples: 30240 | elapsed time per iteration (ms): 13683.0 | learning rate: 8.383E-06 | global batch size: 16 | lm loss: 7.161106E+00 | loss scale: 32768.0 | grad norm: 191134.227 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1891/ 159576 | consumed samples: 30256 | elapsed time per iteration (ms): 14045.3 | learning rate: 8.388E-06 | global batch size: 16 | lm loss: 7.080533E+00 | loss scale: 32768.0 | grad norm: 226207.626 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1892/ 159576 | consumed samples: 30272 | elapsed time per iteration (ms): 13740.4 | learning rate: 8.392E-06 | global batch size: 16 | lm loss: 6.948812E+00 | loss scale: 32768.0 | grad norm: 198329.312 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1893/ 159576 | consumed samples: 30288 | elapsed time per iteration (ms): 13747.4 | learning rate: 8.396E-06 | global batch size: 16 | lm loss: 7.024124E+00 | loss scale: 32768.0 | grad norm: 332574.173 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1894/ 159576 | consumed samples: 
30304 | elapsed time per iteration (ms): 13742.5 | learning rate: 8.401E-06 | global batch size: 16 | lm loss: 7.072248E+00 | loss scale: 32768.0 | grad norm: 351090.950 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1895/ 159576 | consumed samples: 30320 | elapsed time per iteration (ms): 13599.9 | learning rate: 8.405E-06 | global batch size: 16 | lm loss: 6.964484E+00 | loss scale: 32768.0 | grad norm: 180676.580 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1896/ 159576 | consumed samples: 30336 | elapsed time per iteration (ms): 13892.1 | learning rate: 8.410E-06 | global batch size: 16 | lm loss: 7.066601E+00 | loss scale: 32768.0 | grad norm: 186229.787 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1897/ 159576 | consumed samples: 30352 | elapsed time per iteration (ms): 13686.6 | learning rate: 8.414E-06 | global batch size: 16 | lm loss: 6.975677E+00 | loss scale: 32768.0 | grad norm: 145844.159 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1898/ 159576 | consumed samples: 30368 | elapsed time per iteration (ms): 13668.1 | learning rate: 8.419E-06 | global batch size: 16 | lm loss: 7.225606E+00 | loss scale: 32768.0 | grad norm: 229819.640 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1899/ 159576 | consumed samples: 30384 | elapsed time per iteration (ms): 13600.0 | learning rate: 8.423E-06 | global batch size: 16 | lm loss: 7.082514E+00 | loss scale: 32768.0 | grad norm: 185081.109 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1900/ 159576 | consumed samples: 30400 | elapsed time per iteration (ms): 14001.2 | learning rate: 8.428E-06 | global batch size: 16 | lm loss: 7.021253E+00 | loss scale: 32768.0 | grad norm: 220377.192 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1901/ 159576 | consumed samples: 30416 | elapsed time per iteration (ms): 13722.2 | learning rate: 8.432E-06 | global batch size: 16 | lm loss: 7.049896E+00 | loss scale: 32768.0 | grad norm: 166889.016 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1902/ 159576 | consumed samples: 30432 | elapsed time per iteration (ms): 13621.3 | learning rate: 8.436E-06 | global batch size: 16 | lm loss: 6.878879E+00 | loss scale: 32768.0 | grad norm: 145213.866 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1903/ 159576 | consumed samples: 30448 | elapsed time per iteration (ms): 13693.3 | learning rate: 8.441E-06 | global batch size: 16 | lm loss: 6.981446E+00 | loss scale: 32768.0 | grad norm: 385714.234 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1904/ 159576 | consumed samples: 30464 | elapsed time per iteration (ms): 13924.8 | learning rate: 8.445E-06 | global batch size: 16 | lm loss: 7.065192E+00 | loss scale: 32768.0 | grad norm: 230309.474 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1905/ 159576 | consumed samples: 30480 | elapsed time per iteration (ms): 13762.9 | learning rate: 8.450E-06 | global batch size: 16 | lm loss: 7.016763E+00 | loss scale: 32768.0 | grad 
norm: 164701.451 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1906/ 159576 | consumed samples: 30496 | elapsed time per iteration (ms): 13644.6 | learning rate: 8.454E-06 | global batch size: 16 | lm loss: 6.935023E+00 | loss scale: 32768.0 | grad norm: 158636.532 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1907/ 159576 | consumed samples: 30512 | elapsed time per iteration (ms): 13659.2 | learning rate: 8.459E-06 | global batch size: 16 | lm loss: 7.008549E+00 | loss scale: 32768.0 | grad norm: 216415.844 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1908/ 159576 | consumed samples: 30528 | elapsed time per iteration (ms): 13777.8 | learning rate: 8.463E-06 | global batch size: 16 | lm loss: 7.210999E+00 | loss scale: 32768.0 | grad norm: 201609.115 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1909/ 159576 | consumed samples: 30544 | elapsed time per iteration (ms): 13647.1 | learning rate: 8.467E-06 | global batch size: 16 | lm loss: 7.035434E+00 | loss scale: 32768.0 | grad norm: 157381.108 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1910/ 159576 | consumed samples: 30560 | elapsed time per iteration (ms): 13657.7 | learning rate: 8.472E-06 | global batch size: 16 | lm loss: 7.002993E+00 | loss scale: 32768.0 | grad norm: 137094.187 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1911/ 159576 | consumed samples: 30576 | elapsed time per iteration (ms): 13538.8 | learning rate: 8.476E-06 | global batch size: 16 | lm loss: 6.895042E+00 | loss scale: 32768.0 | grad norm: 201565.995 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1912/ 159576 | consumed samples: 30592 | elapsed time per iteration (ms): 13570.4 | learning rate: 8.481E-06 | global batch size: 16 | lm loss: 7.119932E+00 | loss scale: 32768.0 | grad norm: 191020.294 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1913/ 159576 | consumed samples: 30608 | elapsed time per iteration (ms): 13960.8 | learning rate: 8.485E-06 | global batch size: 16 | lm loss: 7.021863E+00 | loss scale: 32768.0 | grad norm: 163947.486 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1914/ 159576 | consumed samples: 30624 | elapsed time per iteration (ms): 13571.3 | learning rate: 8.490E-06 | global batch size: 16 | lm loss: 7.255896E+00 | loss scale: 32768.0 | grad norm: 110811.833 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1915/ 159576 | consumed samples: 30640 | elapsed time per iteration (ms): 13592.9 | learning rate: 8.494E-06 | global batch size: 16 | lm loss: 7.058972E+00 | loss scale: 32768.0 | grad norm: 226666.177 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1916/ 159576 | consumed samples: 30656 | elapsed time per iteration (ms): 13559.3 | learning rate: 8.499E-06 | global batch size: 16 | lm loss: 7.001413E+00 | loss scale: 32768.0 | grad norm: 155562.702 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1917/ 159576 | consumed samples: 
30672 | elapsed time per iteration (ms): 13603.1 | learning rate: 8.503E-06 | global batch size: 16 | lm loss: 6.925358E+00 | loss scale: 32768.0 | grad norm: 153599.875 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1918/ 159576 | consumed samples: 30688 | elapsed time per iteration (ms): 13848.6 | learning rate: 8.507E-06 | global batch size: 16 | lm loss: 7.013722E+00 | loss scale: 32768.0 | grad norm: 151847.788 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1919/ 159576 | consumed samples: 30704 | elapsed time per iteration (ms): 13580.7 | learning rate: 8.512E-06 | global batch size: 16 | lm loss: 7.057837E+00 | loss scale: 32768.0 | grad norm: 149268.841 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1920/ 159576 | consumed samples: 30720 | elapsed time per iteration (ms): 13579.6 | learning rate: 8.516E-06 | global batch size: 16 | lm loss: 7.059657E+00 | loss scale: 32768.0 | grad norm: 211843.149 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1921/ 159576 | consumed samples: 30736 | elapsed time per iteration (ms): 13716.2 | learning rate: 8.521E-06 | global batch size: 16 | lm loss: 7.145122E+00 | loss scale: 32768.0 | grad norm: 158831.213 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1922/ 159576 | consumed samples: 30752 | elapsed time per iteration (ms): 14204.8 | learning rate: 8.525E-06 | global batch size: 16 | lm loss: 7.012016E+00 | loss scale: 32768.0 | grad norm: 142219.675 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1923/ 159576 | consumed samples: 30768 | elapsed time per iteration (ms): 13586.3 | learning rate: 8.530E-06 | global batch size: 16 | lm loss: 6.958722E+00 | loss scale: 32768.0 | grad norm: 147958.053 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1924/ 159576 | consumed samples: 30784 | elapsed time per iteration (ms): 13654.4 | learning rate: 8.534E-06 | global batch size: 16 | lm loss: 6.916204E+00 | loss scale: 32768.0 | grad norm: 168316.999 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1925/ 159576 | consumed samples: 30800 | elapsed time per iteration (ms): 13581.4 | learning rate: 8.538E-06 | global batch size: 16 | lm loss: 7.208139E+00 | loss scale: 32768.0 | grad norm: 186895.870 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1926/ 159576 | consumed samples: 30816 | elapsed time per iteration (ms): 14057.7 | learning rate: 8.543E-06 | global batch size: 16 | lm loss: 6.921901E+00 | loss scale: 32768.0 | grad norm: 136886.936 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1927/ 159576 | consumed samples: 30832 | elapsed time per iteration (ms): 13553.3 | learning rate: 8.547E-06 | global batch size: 16 | lm loss: 7.044703E+00 | loss scale: 32768.0 | grad norm: 318519.845 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1928/ 159576 | consumed samples: 30848 | elapsed time per iteration (ms): 13594.1 | learning rate: 8.552E-06 | global batch size: 16 | lm loss: 6.906800E+00 | loss scale: 32768.0 | grad 
norm: 155021.065 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1929/ 159576 | consumed samples: 30864 | elapsed time per iteration (ms): 13607.1 | learning rate: 8.556E-06 | global batch size: 16 | lm loss: 6.881465E+00 | loss scale: 32768.0 | grad norm: 190717.011 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1930/ 159576 | consumed samples: 30880 | elapsed time per iteration (ms): 13551.6 | learning rate: 8.561E-06 | global batch size: 16 | lm loss: 7.199529E+00 | loss scale: 32768.0 | grad norm: 191859.870 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1931/ 159576 | consumed samples: 30896 | elapsed time per iteration (ms): 13806.2 | learning rate: 8.565E-06 | global batch size: 16 | lm loss: 6.954100E+00 | loss scale: 32768.0 | grad norm: 130775.699 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1932/ 159576 | consumed samples: 30912 | elapsed time per iteration (ms): 13613.1 | learning rate: 8.570E-06 | global batch size: 16 | lm loss: 6.704428E+00 | loss scale: 32768.0 | grad norm: 137607.979 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1933/ 159576 | consumed samples: 30928 | elapsed time per iteration (ms): 13506.4 | learning rate: 8.574E-06 | global batch size: 16 | lm loss: 7.014212E+00 | loss scale: 32768.0 | grad norm: 186579.256 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1934/ 159576 | consumed samples: 30944 | elapsed time per iteration (ms): 13520.6 | learning rate: 8.578E-06 | global batch size: 16 | lm loss: 7.012688E+00 | loss scale: 32768.0 | grad norm: 155464.251 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1935/ 159576 | consumed samples: 30960 | elapsed time per iteration (ms): 13855.4 | learning rate: 8.583E-06 | global batch size: 16 | lm loss: 7.011374E+00 | loss scale: 32768.0 | grad norm: 128570.064 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1936/ 159576 | consumed samples: 30976 | elapsed time per iteration (ms): 13483.8 | learning rate: 8.587E-06 | global batch size: 16 | lm loss: 6.823971E+00 | loss scale: 32768.0 | grad norm: 185286.139 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1937/ 159576 | consumed samples: 30992 | elapsed time per iteration (ms): 13455.5 | learning rate: 8.592E-06 | global batch size: 16 | lm loss: 7.002713E+00 | loss scale: 32768.0 | grad norm: 168834.944 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1938/ 159576 | consumed samples: 31008 | elapsed time per iteration (ms): 13488.7 | learning rate: 8.596E-06 | global batch size: 16 | lm loss: 7.308265E+00 | loss scale: 32768.0 | grad norm: 113334.432 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1939/ 159576 | consumed samples: 31024 | elapsed time per iteration (ms): 13517.8 | learning rate: 8.601E-06 | global batch size: 16 | lm loss: 6.832065E+00 | loss scale: 32768.0 | grad norm: 143617.951 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1940/ 159576 | consumed samples: 
31040 | elapsed time per iteration (ms): 13777.8 | learning rate: 8.605E-06 | global batch size: 16 | lm loss: 6.758460E+00 | loss scale: 32768.0 | grad norm: 131000.555 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1941/ 159576 | consumed samples: 31056 | elapsed time per iteration (ms): 13526.9 | learning rate: 8.609E-06 | global batch size: 16 | lm loss: 6.587332E+00 | loss scale: 32768.0 | grad norm: 133270.011 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1942/ 159576 | consumed samples: 31072 | elapsed time per iteration (ms): 13522.3 | learning rate: 8.614E-06 | global batch size: 16 | lm loss: 7.005889E+00 | loss scale: 32768.0 | grad norm: 169934.736 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1943/ 159576 | consumed samples: 31088 | elapsed time per iteration (ms): 13505.7 | learning rate: 8.618E-06 | global batch size: 16 | lm loss: 7.113358E+00 | loss scale: 32768.0 | grad norm: 147469.388 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1944/ 159576 | consumed samples: 31104 | elapsed time per iteration (ms): 14004.8 | learning rate: 8.623E-06 | global batch size: 16 | lm loss: 6.815184E+00 | loss scale: 32768.0 | grad norm: 129420.793 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1945/ 159576 | consumed samples: 31120 | elapsed time per iteration (ms): 13536.0 | learning rate: 8.627E-06 | global batch size: 16 | lm loss: 6.802580E+00 | loss scale: 32768.0 | grad norm: 206454.023 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1946/ 159576 | consumed samples: 31136 | elapsed time per iteration (ms): 13571.2 | learning rate: 8.632E-06 | global batch size: 16 | lm loss: 6.899452E+00 | loss scale: 32768.0 | grad norm: 159625.238 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1947/ 159576 | consumed samples: 31152 | elapsed time per iteration (ms): 13512.7 | learning rate: 8.636E-06 | global batch size: 16 | lm loss: 6.902468E+00 | loss scale: 32768.0 | grad norm: 161374.302 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1948/ 159576 | consumed samples: 31168 | elapsed time per iteration (ms): 13965.3 | learning rate: 8.641E-06 | global batch size: 16 | lm loss: 7.027518E+00 | loss scale: 32768.0 | grad norm: 141898.251 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1949/ 159576 | consumed samples: 31184 | elapsed time per iteration (ms): 13617.6 | learning rate: 8.645E-06 | global batch size: 16 | lm loss: 6.901030E+00 | loss scale: 32768.0 | grad norm: 115156.669 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1950/ 159576 | consumed samples: 31200 | elapsed time per iteration (ms): 13549.7 | learning rate: 8.649E-06 | global batch size: 16 | lm loss: 7.012411E+00 | loss scale: 32768.0 | grad norm: 364327.043 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1951/ 159576 | consumed samples: 31216 | elapsed time per iteration (ms): 13460.7 | learning rate: 8.654E-06 | global batch size: 16 | lm loss: 6.996010E+00 | loss scale: 32768.0 | grad 
norm: 265923.298 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1952/ 159576 | consumed samples: 31232 | elapsed time per iteration (ms): 13574.9 | learning rate: 8.658E-06 | global batch size: 16 | lm loss: 7.002955E+00 | loss scale: 32768.0 | grad norm: 147080.962 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1953/ 159576 | consumed samples: 31248 | elapsed time per iteration (ms): 13782.5 | learning rate: 8.663E-06 | global batch size: 16 | lm loss: 6.930263E+00 | loss scale: 32768.0 | grad norm: 190217.592 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1954/ 159576 | consumed samples: 31264 | elapsed time per iteration (ms): 13515.2 | learning rate: 8.667E-06 | global batch size: 16 | lm loss: 6.835277E+00 | loss scale: 32768.0 | grad norm: 254678.528 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1955/ 159576 | consumed samples: 31280 | elapsed time per iteration (ms): 13569.3 | learning rate: 8.672E-06 | global batch size: 16 | lm loss: 7.283230E+00 | loss scale: 32768.0 | grad norm: 137167.505 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1956/ 159576 | consumed samples: 31296 | elapsed time per iteration (ms): 13592.0 | learning rate: 8.676E-06 | global batch size: 16 | lm loss: 6.895840E+00 | loss scale: 32768.0 | grad norm: 198657.902 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1957/ 159576 | consumed samples: 31312 | elapsed time per iteration (ms): 13906.4 | learning rate: 8.680E-06 | global batch size: 16 | lm loss: 7.127283E+00 | loss scale: 32768.0 | grad norm: 242163.922 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1958/ 159576 | consumed samples: 31328 | elapsed time per iteration (ms): 13647.9 | learning rate: 8.685E-06 | global batch size: 16 | lm loss: 7.022318E+00 | loss scale: 32768.0 | grad norm: 179227.362 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1959/ 159576 | consumed samples: 31344 | elapsed time per iteration (ms): 13668.0 | learning rate: 8.689E-06 | global batch size: 16 | lm loss: 7.021772E+00 | loss scale: 32768.0 | grad norm: 223437.294 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1960/ 159576 | consumed samples: 31360 | elapsed time per iteration (ms): 13699.2 | learning rate: 8.694E-06 | global batch size: 16 | lm loss: 7.270517E+00 | loss scale: 32768.0 | grad norm: 166965.849 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1961/ 159576 | consumed samples: 31376 | elapsed time per iteration (ms): 13595.5 | learning rate: 8.698E-06 | global batch size: 16 | lm loss: 6.963766E+00 | loss scale: 32768.0 | grad norm: 257581.689 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1962/ 159576 | consumed samples: 31392 | elapsed time per iteration (ms): 13818.3 | learning rate: 8.703E-06 | global batch size: 16 | lm loss: 6.847409E+00 | loss scale: 32768.0 | grad norm: 162709.033 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1963/ 159576 | consumed samples: 
31408 | elapsed time per iteration (ms): 13645.3 | learning rate: 8.707E-06 | global batch size: 16 | lm loss: 6.902783E+00 | loss scale: 32768.0 | grad norm: 186486.366 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1964/ 159576 | consumed samples: 31424 | elapsed time per iteration (ms): 13637.0 | learning rate: 8.712E-06 | global batch size: 16 | lm loss: 7.112407E+00 | loss scale: 32768.0 | grad norm: 234566.375 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1965/ 159576 | consumed samples: 31440 | elapsed time per iteration (ms): 13632.5 | learning rate: 8.716E-06 | global batch size: 16 | lm loss: 6.965158E+00 | loss scale: 32768.0 | grad norm: 162405.643 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1966/ 159576 | consumed samples: 31456 | elapsed time per iteration (ms): 13923.2 | learning rate: 8.720E-06 | global batch size: 16 | lm loss: 7.162685E+00 | loss scale: 32768.0 | grad norm: 160740.607 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1967/ 159576 | consumed samples: 31472 | elapsed time per iteration (ms): 13722.5 | learning rate: 8.725E-06 | global batch size: 16 | lm loss: 6.822609E+00 | loss scale: 32768.0 | grad norm: 163162.027 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1968/ 159576 | consumed samples: 31488 | elapsed time per iteration (ms): 13559.9 | learning rate: 8.729E-06 | global batch size: 16 | lm loss: 6.829067E+00 | loss scale: 32768.0 | grad norm: 148991.615 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1969/ 159576 | consumed samples: 31504 | elapsed time per iteration (ms): 13640.6 | learning rate: 8.734E-06 | global batch size: 16 | lm loss: 6.753247E+00 | loss scale: 32768.0 | grad norm: 174635.290 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1970/ 159576 | consumed samples: 31520 | elapsed time per iteration (ms): 13996.0 | learning rate: 8.738E-06 | global batch size: 16 | lm loss: 7.113372E+00 | loss scale: 32768.0 | grad norm: 278150.407 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1971/ 159576 | consumed samples: 31536 | elapsed time per iteration (ms): 13669.9 | learning rate: 8.743E-06 | global batch size: 16 | lm loss: 6.872749E+00 | loss scale: 32768.0 | grad norm: 176866.437 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1972/ 159576 | consumed samples: 31552 | elapsed time per iteration (ms): 13634.0 | learning rate: 8.747E-06 | global batch size: 16 | lm loss: 6.944706E+00 | loss scale: 32768.0 | grad norm: 145690.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1973/ 159576 | consumed samples: 31568 | elapsed time per iteration (ms): 13676.3 | learning rate: 8.751E-06 | global batch size: 16 | lm loss: 7.106283E+00 | loss scale: 32768.0 | grad norm: 154568.562 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1974/ 159576 | consumed samples: 31584 | elapsed time per iteration (ms): 13610.0 | learning rate: 8.756E-06 | global batch size: 16 | lm loss: 7.001073E+00 | loss scale: 32768.0 | grad 
norm: 156908.897 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1975/ 159576 | consumed samples: 31600 | elapsed time per iteration (ms): 13727.1 | learning rate: 8.760E-06 | global batch size: 16 | lm loss: 7.050818E+00 | loss scale: 32768.0 | grad norm: 234696.902 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1976/ 159576 | consumed samples: 31616 | elapsed time per iteration (ms): 13612.3 | learning rate: 8.765E-06 | global batch size: 16 | lm loss: 7.084875E+00 | loss scale: 32768.0 | grad norm: 169650.883 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1977/ 159576 | consumed samples: 31632 | elapsed time per iteration (ms): 13652.4 | learning rate: 8.769E-06 | global batch size: 16 | lm loss: 6.942274E+00 | loss scale: 32768.0 | grad norm: 133422.940 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1978/ 159576 | consumed samples: 31648 | elapsed time per iteration (ms): 13598.6 | learning rate: 8.774E-06 | global batch size: 16 | lm loss: 7.020503E+00 | loss scale: 32768.0 | grad norm: 191046.458 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1979/ 159576 | consumed samples: 31664 | elapsed time per iteration (ms): 6793.7 | learning rate: 8.774E-06 | global batch size: 16 | lm loss: 7.205068E+00 | loss scale: 16384.0 | grad norm: 191046.458 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1980/ 159576 | consumed samples: 31680 | elapsed time per iteration (ms): 13294.9 | learning rate: 8.778E-06 | global batch size: 16 | lm loss: 6.981399E+00 | loss scale: 16384.0 | grad norm: 88750.748 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1981/ 159576 | consumed samples: 31696 | elapsed time per iteration (ms): 13611.4 | learning rate: 8.783E-06 | global batch size: 16 | lm loss: 7.062120E+00 | loss scale: 16384.0 | grad norm: 98643.338 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1982/ 159576 | consumed samples: 31712 | elapsed time per iteration (ms): 13593.8 | learning rate: 8.787E-06 | global batch size: 16 | lm loss: 6.878181E+00 | loss scale: 16384.0 | grad norm: 67555.616 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1983/ 159576 | consumed samples: 31728 | elapsed time per iteration (ms): 13656.6 | learning rate: 8.791E-06 | global batch size: 16 | lm loss: 6.958256E+00 | loss scale: 16384.0 | grad norm: 79163.237 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1984/ 159576 | consumed samples: 31744 | elapsed time per iteration (ms): 13863.2 | learning rate: 8.796E-06 | global batch size: 16 | lm loss: 6.850488E+00 | loss scale: 16384.0 | grad norm: 49908.825 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1985/ 159576 | consumed samples: 31760 | elapsed time per iteration (ms): 13625.0 | learning rate: 8.800E-06 | global batch size: 16 | lm loss: 7.227520E+00 | loss scale: 16384.0 | grad norm: 56779.919 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1986/ 159576 | consumed samples: 31776 
| elapsed time per iteration (ms): 13644.4 | learning rate: 8.805E-06 | global batch size: 16 | lm loss: 7.002261E+00 | loss scale: 16384.0 | grad norm: 88929.296 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1987/ 159576 | consumed samples: 31792 | elapsed time per iteration (ms): 13690.4 | learning rate: 8.809E-06 | global batch size: 16 | lm loss: 7.085162E+00 | loss scale: 16384.0 | grad norm: 50454.218 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1988/ 159576 | consumed samples: 31808 | elapsed time per iteration (ms): 13934.9 | learning rate: 8.814E-06 | global batch size: 16 | lm loss: 6.948382E+00 | loss scale: 16384.0 | grad norm: 95360.624 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1989/ 159576 | consumed samples: 31824 | elapsed time per iteration (ms): 13779.2 | learning rate: 8.818E-06 | global batch size: 16 | lm loss: 6.810514E+00 | loss scale: 16384.0 | grad norm: 64656.236 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1990/ 159576 | consumed samples: 31840 | elapsed time per iteration (ms): 13639.8 | learning rate: 8.822E-06 | global batch size: 16 | lm loss: 6.904098E+00 | loss scale: 16384.0 | grad norm: 77126.795 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1991/ 159576 | consumed samples: 31856 | elapsed time per iteration (ms): 13559.7 | learning rate: 8.827E-06 | global batch size: 16 | lm loss: 6.833849E+00 | loss scale: 16384.0 | grad norm: 68875.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1992/ 159576 | consumed samples: 31872 | elapsed time per iteration (ms): 13602.8 | learning rate: 8.831E-06 | global batch size: 16 | lm loss: 6.989305E+00 | loss scale: 16384.0 | grad norm: 77647.510 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1993/ 159576 | consumed samples: 31888 | elapsed time per iteration (ms): 13976.7 | learning rate: 8.836E-06 | global batch size: 16 | lm loss: 6.928751E+00 | loss scale: 16384.0 | grad norm: 67757.778 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1994/ 159576 | consumed samples: 31904 | elapsed time per iteration (ms): 13704.1 | learning rate: 8.840E-06 | global batch size: 16 | lm loss: 6.835466E+00 | loss scale: 16384.0 | grad norm: 69187.044 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1995/ 159576 | consumed samples: 31920 | elapsed time per iteration (ms): 13650.9 | learning rate: 8.845E-06 | global batch size: 16 | lm loss: 7.294861E+00 | loss scale: 16384.0 | grad norm: 143539.847 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1996/ 159576 | consumed samples: 31936 | elapsed time per iteration (ms): 13627.5 | learning rate: 8.849E-06 | global batch size: 16 | lm loss: 7.121392E+00 | loss scale: 16384.0 | grad norm: 74325.382 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1997/ 159576 | consumed samples: 31952 | elapsed time per iteration (ms): 13965.6 | learning rate: 8.854E-06 | global batch size: 16 | lm loss: 7.100244E+00 | loss scale: 16384.0 | grad norm: 92130.836 | 
num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1998/ 159576 | consumed samples: 31968 | elapsed time per iteration (ms): 13575.4 | learning rate: 8.858E-06 | global batch size: 16 | lm loss: 7.115794E+00 | loss scale: 16384.0 | grad norm: 85348.558 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 1999/ 159576 | consumed samples: 31984 | elapsed time per iteration (ms): 13570.5 | learning rate: 8.862E-06 | global batch size: 16 | lm loss: 6.867460E+00 | loss scale: 16384.0 | grad norm: 112217.552 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -[2021-09-24 09:55:59,653] [INFO] [logging.py:68:log_dist] [Rank 0] step=2000, skipped=2, lr=[8.866863905325443e-06, 8.866863905325443e-06], mom=[(0.9, 0.999), (0.9, 0.999)] -steps: 2000 loss: 6.9298 iter time (s): 0.007 samples/sec: 2456.816 - iteration 2000/ 159576 | consumed samples: 32000 | elapsed time per iteration (ms): 13633.7 | learning rate: 8.867E-06 | global batch size: 16 | lm loss: 6.929829E+00 | loss scale: 16384.0 | grad norm: 73846.467 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) ------------------------------------------------------------------------------------------------- - validation loss at iteration 2000 | lm loss value: 7.084489E+00 | lm loss PPL: 1.193313E+03 | ------------------------------------------------------------------------------------------------- - iteration 2001/ 159576 | consumed samples: 32016 | elapsed time per iteration (ms): 18999.4 | learning rate: 8.871E-06 | global batch size: 16 | lm loss: 6.882600E+00 | loss scale: 16384.0 | grad norm: 132358.129 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2002/ 159576 | consumed samples: 32032 | elapsed time per iteration (ms): 13626.5 | learning rate: 8.876E-06 | global batch size: 16 | lm loss: 7.231313E+00 | loss scale: 16384.0 | grad norm: 139453.166 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2003/ 159576 | consumed samples: 32048 | elapsed time per iteration (ms): 13687.4 | learning rate: 8.880E-06 | global batch size: 16 | lm loss: 7.034769E+00 | loss scale: 16384.0 | grad norm: 74117.400 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2004/ 159576 | consumed samples: 32064 | elapsed time per iteration (ms): 13579.3 | learning rate: 8.885E-06 | global batch size: 16 | lm loss: 7.053939E+00 | loss scale: 16384.0 | grad norm: 185455.314 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2005/ 159576 | consumed samples: 32080 | elapsed time per iteration (ms): 13617.6 | learning rate: 8.889E-06 | global batch size: 16 | lm loss: 6.871277E+00 | loss scale: 16384.0 | grad norm: 117343.684 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2006/ 159576 | consumed samples: 32096 | elapsed time per iteration (ms): 13892.7 | learning rate: 8.893E-06 | global batch size: 16 | lm loss: 6.839181E+00 | loss scale: 16384.0 | grad norm: 77619.124 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2007/ 159576 | consumed samples: 32112 | elapsed time per iteration (ms): 13580.2 | learning rate: 8.898E-06 | global batch size: 16 | lm 
loss: 7.031313E+00 | loss scale: 16384.0 | grad norm: 111506.485 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2008/ 159576 | consumed samples: 32128 | elapsed time per iteration (ms): 13652.0 | learning rate: 8.902E-06 | global batch size: 16 | lm loss: 6.763354E+00 | loss scale: 16384.0 | grad norm: 74284.647 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2009/ 159576 | consumed samples: 32144 | elapsed time per iteration (ms): 13663.9 | learning rate: 8.907E-06 | global batch size: 16 | lm loss: 7.173141E+00 | loss scale: 16384.0 | grad norm: 176920.841 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2010/ 159576 | consumed samples: 32160 | elapsed time per iteration (ms): 14071.2 | learning rate: 8.911E-06 | global batch size: 16 | lm loss: 6.940368E+00 | loss scale: 16384.0 | grad norm: 136609.771 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2011/ 159576 | consumed samples: 32176 | elapsed time per iteration (ms): 13641.6 | learning rate: 8.916E-06 | global batch size: 16 | lm loss: 7.348205E+00 | loss scale: 16384.0 | grad norm: 74685.726 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2012/ 159576 | consumed samples: 32192 | elapsed time per iteration (ms): 13599.3 | learning rate: 8.920E-06 | global batch size: 16 | lm loss: 6.813260E+00 | loss scale: 16384.0 | grad norm: 98269.295 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2013/ 159576 | consumed samples: 32208 | elapsed time per iteration (ms): 13658.0 | learning rate: 8.925E-06 | global batch size: 16 | lm loss: 7.088203E+00 | loss scale: 16384.0 | grad norm: 67591.274 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2014/ 159576 | consumed samples: 32224 | elapsed time per iteration (ms): 14073.3 | learning rate: 8.929E-06 | global batch size: 16 | lm loss: 6.925144E+00 | loss scale: 16384.0 | grad norm: 125518.891 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2015/ 159576 | consumed samples: 32240 | elapsed time per iteration (ms): 13531.4 | learning rate: 8.933E-06 | global batch size: 16 | lm loss: 7.150875E+00 | loss scale: 16384.0 | grad norm: 145833.664 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2016/ 159576 | consumed samples: 32256 | elapsed time per iteration (ms): 13718.9 | learning rate: 8.938E-06 | global batch size: 16 | lm loss: 7.058916E+00 | loss scale: 16384.0 | grad norm: 104576.621 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2017/ 159576 | consumed samples: 32272 | elapsed time per iteration (ms): 13660.3 | learning rate: 8.942E-06 | global batch size: 16 | lm loss: 7.075126E+00 | loss scale: 16384.0 | grad norm: 68969.823 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2018/ 159576 | consumed samples: 32288 | elapsed time per iteration (ms): 13657.9 | learning rate: 8.947E-06 | global batch size: 16 | lm loss: 7.021468E+00 | loss scale: 16384.0 | grad norm: 102873.081 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - 
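
The records above capture two mechanisms worth making explicit. The "lm loss PPL" reported at the iteration-2000 validation is simply the exponential of the language-model cross-entropy: exp(7.084489) ≈ 1193.313, matching the logged 1.193313E+03. And the "loss scale" drop from 32768.0 to 16384.0 at iteration 1979 (note that its learning rate 8.774E-06 and grad norm 191046.458 repeat from iteration 1978, and the DeepSpeed step-2000 summary reports skipped=2) is fp16 dynamic loss scaling at work: a step whose gradients overflow is skipped and the scale is halved, while a long run of overflow-free steps grows it back. A minimal sketch of both, using hypothetical names (DynamicLossScaler, has_overflow) rather than the actual Megatron-DeepSpeed classes:

    import math

    # Perplexity is exp(cross-entropy); reproduces the validation line at iteration 2000.
    print(f"lm loss PPL: {math.exp(7.084489):.6E}")  # ~1.193313E+03, as logged

    class DynamicLossScaler:
        # Hypothetical sketch of fp16 dynamic loss scaling, not DeepSpeed's real API.
        def __init__(self, init_scale=2.0 ** 15, scale_window=1000, scale_factor=2.0):
            self.scale = init_scale            # e.g. 32768.0, as in the log
            self.scale_window = scale_window   # overflow-free steps required before regrowing
            self.scale_factor = scale_factor
            self.good_steps = 0

        def update(self, has_overflow: bool) -> bool:
            # Returns True when the optimizer step should be skipped.
            if has_overflow:
                self.scale /= self.scale_factor  # 32768.0 -> 16384.0, as at iteration 1979
                self.good_steps = 0
                return True                      # skipped steps accumulate into skipped=N
            self.good_steps += 1
            if self.good_steps >= self.scale_window:
                self.scale *= self.scale_factor  # regrow after a clean window
                self.good_steps = 0
            return False
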
iteration 2019/ 159576 | consumed samples: 32304 | elapsed time per iteration (ms): 13864.5 | learning rate: 8.951E-06 | global batch size: 16 | lm loss: 7.182456E+00 | loss scale: 16384.0 | grad norm: 83098.867 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2020/ 159576 | consumed samples: 32320 | elapsed time per iteration (ms): 13595.8 | learning rate: 8.956E-06 | global batch size: 16 | lm loss: 7.201014E+00 | loss scale: 16384.0 | grad norm: 86577.891 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2021/ 159576 | consumed samples: 32336 | elapsed time per iteration (ms): 13656.2 | learning rate: 8.960E-06 | global batch size: 16 | lm loss: 7.021406E+00 | loss scale: 16384.0 | grad norm: 81681.230 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2022/ 159576 | consumed samples: 32352 | elapsed time per iteration (ms): 13573.2 | learning rate: 8.964E-06 | global batch size: 16 | lm loss: 7.084285E+00 | loss scale: 16384.0 | grad norm: 87860.375 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2023/ 159576 | consumed samples: 32368 | elapsed time per iteration (ms): 13983.6 | learning rate: 8.969E-06 | global batch size: 16 | lm loss: 6.934657E+00 | loss scale: 16384.0 | grad norm: 59691.509 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2024/ 159576 | consumed samples: 32384 | elapsed time per iteration (ms): 13601.4 | learning rate: 8.973E-06 | global batch size: 16 | lm loss: 7.007637E+00 | loss scale: 16384.0 | grad norm: 90222.500 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2025/ 159576 | consumed samples: 32400 | elapsed time per iteration (ms): 13711.5 | learning rate: 8.978E-06 | global batch size: 16 | lm loss: 6.979746E+00 | loss scale: 16384.0 | grad norm: 93849.629 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2026/ 159576 | consumed samples: 32416 | elapsed time per iteration (ms): 13699.6 | learning rate: 8.982E-06 | global batch size: 16 | lm loss: 6.934021E+00 | loss scale: 16384.0 | grad norm: 80041.099 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2027/ 159576 | consumed samples: 32432 | elapsed time per iteration (ms): 14076.1 | learning rate: 8.987E-06 | global batch size: 16 | lm loss: 6.980267E+00 | loss scale: 16384.0 | grad norm: 62895.732 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2028/ 159576 | consumed samples: 32448 | elapsed time per iteration (ms): 13679.2 | learning rate: 8.991E-06 | global batch size: 16 | lm loss: 7.024888E+00 | loss scale: 16384.0 | grad norm: 52171.920 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2029/ 159576 | consumed samples: 32464 | elapsed time per iteration (ms): 13587.5 | learning rate: 8.996E-06 | global batch size: 16 | lm loss: 7.115479E+00 | loss scale: 16384.0 | grad norm: 102889.917 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2030/ 159576 | consumed samples: 32480 | elapsed time per iteration (ms): 13601.6 | learning rate: 9.000E-06 | global batch size: 16 | lm loss: 
7.058015E+00 | loss scale: 16384.0 | grad norm: 59629.338 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2031/ 159576 | consumed samples: 32496 | elapsed time per iteration (ms): 13586.5 | learning rate: 9.004E-06 | global batch size: 16 | lm loss: 7.114190E+00 | loss scale: 16384.0 | grad norm: 71212.111 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2032/ 159576 | consumed samples: 32512 | elapsed time per iteration (ms): 13640.1 | learning rate: 9.009E-06 | global batch size: 16 | lm loss: 7.060964E+00 | loss scale: 16384.0 | grad norm: 64723.435 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2033/ 159576 | consumed samples: 32528 | elapsed time per iteration (ms): 13600.9 | learning rate: 9.013E-06 | global batch size: 16 | lm loss: 7.134828E+00 | loss scale: 16384.0 | grad norm: 56762.338 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2034/ 159576 | consumed samples: 32544 | elapsed time per iteration (ms): 13742.8 | learning rate: 9.018E-06 | global batch size: 16 | lm loss: 7.147020E+00 | loss scale: 16384.0 | grad norm: 116614.867 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2035/ 159576 | consumed samples: 32560 | elapsed time per iteration (ms): 13462.2 | learning rate: 9.022E-06 | global batch size: 16 | lm loss: 7.059257E+00 | loss scale: 16384.0 | grad norm: 95862.140 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2036/ 159576 | consumed samples: 32576 | elapsed time per iteration (ms): 13758.4 | learning rate: 9.027E-06 | global batch size: 16 | lm loss: 7.181802E+00 | loss scale: 16384.0 | grad norm: 60688.861 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2037/ 159576 | consumed samples: 32592 | elapsed time per iteration (ms): 13746.2 | learning rate: 9.031E-06 | global batch size: 16 | lm loss: 6.934449E+00 | loss scale: 16384.0 | grad norm: 71317.463 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2038/ 159576 | consumed samples: 32608 | elapsed time per iteration (ms): 13575.3 | learning rate: 9.036E-06 | global batch size: 16 | lm loss: 7.080134E+00 | loss scale: 16384.0 | grad norm: 57102.289 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2039/ 159576 | consumed samples: 32624 | elapsed time per iteration (ms): 13585.4 | learning rate: 9.040E-06 | global batch size: 16 | lm loss: 6.727631E+00 | loss scale: 16384.0 | grad norm: 74231.007 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2040/ 159576 | consumed samples: 32640 | elapsed time per iteration (ms): 13615.9 | learning rate: 9.044E-06 | global batch size: 16 | lm loss: 6.939786E+00 | loss scale: 16384.0 | grad norm: 75764.732 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -WARNING:apscheduler.scheduler:Execution of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-24 10:05:20 CEST)" skipped: maximum number of running instances reached (3) - iteration 2041/ 159576 | consumed samples: 32656 | elapsed time per iteration (ms): 14017.0 | learning rate: 
9.049E-06 | global batch size: 16 | lm loss: 6.912762E+00 | loss scale: 16384.0 | grad norm: 60359.456 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2042/ 159576 | consumed samples: 32672 | elapsed time per iteration (ms): 13386.8 | learning rate: 9.053E-06 | global batch size: 16 | lm loss: 6.892349E+00 | loss scale: 16384.0 | grad norm: 68369.533 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -[2021-09-24 10:05:52] PULSE: tr8-104B is waiting for the previous job to finish before scheduling a new one using the dependency mechanism (1165978_[1-10%1] on 'gpu_p13' partition) -[2021-09-24 10:05:52] PULSE: tr8-104B is running for 4:13:41 since 2021-09-24T05:52:11 (1162855_1 on 'gpu_p13' partition (r6i4n[5,7],r6i5n[2,7-8],r6i6n[0,2,6],r7i2n[4-5],r7i6n[2-4],r7i7n[7-8],r8i0n[2-3,5-8],r8i1n[0,2-4],r8i2n8,r8i3n[0-2],r8i5n[3-4],r8i7n[3-8],r9i0n[0-2],r9i1n[0-3],r9i2n[3-5,8],r9i3n[0-1,7-8],r9i4n[0-2],r9i5n[3-8],r9i6n[0,7-8]) - iteration 2043/ 159576 | consumed samples: 32688 | elapsed time per iteration (ms): 13496.3 | learning rate: 9.058E-06 | global batch size: 16 | lm loss: 7.106496E+00 | loss scale: 16384.0 | grad norm: 74847.038 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2044/ 159576 | consumed samples: 32704 | elapsed time per iteration (ms): 13461.5 | learning rate: 9.062E-06 | global batch size: 16 | lm loss: 7.101841E+00 | loss scale: 16384.0 | grad norm: 81326.664 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2045/ 159576 | consumed samples: 32720 | elapsed time per iteration (ms): 14029.5 | learning rate: 9.067E-06 | global batch size: 16 | lm loss: 6.818883E+00 | loss scale: 16384.0 | grad norm: 55780.102 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2046/ 159576 | consumed samples: 32736 | elapsed time per iteration (ms): 13528.3 | learning rate: 9.071E-06 | global batch size: 16 | lm loss: 7.344654E+00 | loss scale: 16384.0 | grad norm: 85807.867 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2047/ 159576 | consumed samples: 32752 | elapsed time per iteration (ms): 13633.2 | learning rate: 9.075E-06 | global batch size: 16 | lm loss: 7.041794E+00 | loss scale: 16384.0 | grad norm: 68040.665 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2048/ 159576 | consumed samples: 32768 | elapsed time per iteration (ms): 13714.3 | learning rate: 9.080E-06 | global batch size: 16 | lm loss: 7.051764E+00 | loss scale: 16384.0 | grad norm: 54860.412 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2049/ 159576 | consumed samples: 32784 | elapsed time per iteration (ms): 13991.3 | learning rate: 9.084E-06 | global batch size: 16 | lm loss: 6.824497E+00 | loss scale: 16384.0 | grad norm: 71323.543 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2050/ 159576 | consumed samples: 32800 | elapsed time per iteration (ms): 13606.5 | learning rate: 9.089E-06 | global batch size: 16 | lm loss: 7.182322E+00 | loss scale: 16384.0 | grad norm: 85719.647 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2051/ 159576 | consumed samples: 32816 | 
elapsed time per iteration (ms): 13580.8 | learning rate: 9.093E-06 | global batch size: 16 | lm loss: 7.293634E+00 | loss scale: 16384.0 | grad norm: 80588.421 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2052/ 159576 | consumed samples: 32832 | elapsed time per iteration (ms): 13550.0 | learning rate: 9.098E-06 | global batch size: 16 | lm loss: 7.101615E+00 | loss scale: 16384.0 | grad norm: 84442.652 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2053/ 159576 | consumed samples: 32848 | elapsed time per iteration (ms): 13599.2 | learning rate: 9.102E-06 | global batch size: 16 | lm loss: 7.037670E+00 | loss scale: 16384.0 | grad norm: 66660.579 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2054/ 159576 | consumed samples: 32864 | elapsed time per iteration (ms): 13845.0 | learning rate: 9.107E-06 | global batch size: 16 | lm loss: 7.019003E+00 | loss scale: 16384.0 | grad norm: 62001.885 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2055/ 159576 | consumed samples: 32880 | elapsed time per iteration (ms): 13669.5 | learning rate: 9.111E-06 | global batch size: 16 | lm loss: 6.911786E+00 | loss scale: 16384.0 | grad norm: 117097.154 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2056/ 159576 | consumed samples: 32896 | elapsed time per iteration (ms): 13595.0 | learning rate: 9.115E-06 | global batch size: 16 | lm loss: 7.090348E+00 | loss scale: 16384.0 | grad norm: 84113.874 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2057/ 159576 | consumed samples: 32912 | elapsed time per iteration (ms): 13602.9 | learning rate: 9.120E-06 | global batch size: 16 | lm loss: 6.805397E+00 | loss scale: 16384.0 | grad norm: 74285.496 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2058/ 159576 | consumed samples: 32928 | elapsed time per iteration (ms): 13938.5 | learning rate: 9.124E-06 | global batch size: 16 | lm loss: 7.156925E+00 | loss scale: 16384.0 | grad norm: 123564.876 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2059/ 159576 | consumed samples: 32944 | elapsed time per iteration (ms): 13535.6 | learning rate: 9.129E-06 | global batch size: 16 | lm loss: 7.097910E+00 | loss scale: 16384.0 | grad norm: 80614.365 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2060/ 159576 | consumed samples: 32960 | elapsed time per iteration (ms): 13561.1 | learning rate: 9.133E-06 | global batch size: 16 | lm loss: 7.173540E+00 | loss scale: 16384.0 | grad norm: 82969.227 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2061/ 159576 | consumed samples: 32976 | elapsed time per iteration (ms): 13641.0 | learning rate: 9.138E-06 | global batch size: 16 | lm loss: 6.963642E+00 | loss scale: 16384.0 | grad norm: 58968.599 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2062/ 159576 | consumed samples: 32992 | elapsed time per iteration (ms): 13737.9 | learning rate: 9.142E-06 | global batch size: 16 | lm loss: 6.932078E+00 | loss scale: 16384.0 | grad norm: 176037.023 | 
num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2063/ 159576 | consumed samples: 33008 | elapsed time per iteration (ms): 13779.6 | learning rate: 9.146E-06 | global batch size: 16 | lm loss: 6.904696E+00 | loss scale: 16384.0 | grad norm: 107303.418 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2064/ 159576 | consumed samples: 33024 | elapsed time per iteration (ms): 13634.2 | learning rate: 9.151E-06 | global batch size: 16 | lm loss: 6.834531E+00 | loss scale: 16384.0 | grad norm: 100378.838 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2065/ 159576 | consumed samples: 33040 | elapsed time per iteration (ms): 13654.1 | learning rate: 9.155E-06 | global batch size: 16 | lm loss: 7.101809E+00 | loss scale: 16384.0 | grad norm: 100637.535 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2066/ 159576 | consumed samples: 33056 | elapsed time per iteration (ms): 13496.2 | learning rate: 9.160E-06 | global batch size: 16 | lm loss: 6.822946E+00 | loss scale: 16384.0 | grad norm: 72463.326 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2067/ 159576 | consumed samples: 33072 | elapsed time per iteration (ms): 14117.2 | learning rate: 9.164E-06 | global batch size: 16 | lm loss: 7.133995E+00 | loss scale: 16384.0 | grad norm: 265928.980 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2068/ 159576 | consumed samples: 33088 | elapsed time per iteration (ms): 13658.0 | learning rate: 9.169E-06 | global batch size: 16 | lm loss: 7.058832E+00 | loss scale: 16384.0 | grad norm: 225451.637 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2069/ 159576 | consumed samples: 33104 | elapsed time per iteration (ms): 13647.8 | learning rate: 9.173E-06 | global batch size: 16 | lm loss: 6.733691E+00 | loss scale: 16384.0 | grad norm: 109352.478 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2070/ 159576 | consumed samples: 33120 | elapsed time per iteration (ms): 13662.1 | learning rate: 9.178E-06 | global batch size: 16 | lm loss: 7.330385E+00 | loss scale: 16384.0 | grad norm: 106190.502 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2071/ 159576 | consumed samples: 33136 | elapsed time per iteration (ms): 14047.9 | learning rate: 9.182E-06 | global batch size: 16 | lm loss: 6.902629E+00 | loss scale: 16384.0 | grad norm: 105263.547 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2072/ 159576 | consumed samples: 33152 | elapsed time per iteration (ms): 13604.8 | learning rate: 9.186E-06 | global batch size: 16 | lm loss: 7.059223E+00 | loss scale: 16384.0 | grad norm: 156071.065 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2073/ 159576 | consumed samples: 33168 | elapsed time per iteration (ms): 13509.3 | learning rate: 9.191E-06 | global batch size: 16 | lm loss: 6.858756E+00 | loss scale: 16384.0 | grad norm: 183069.137 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2074/ 159576 | consumed samples: 33184 | elapsed 
time per iteration (ms): 13577.0 | learning rate: 9.195E-06 | global batch size: 16 | lm loss: 7.137619E+00 | loss scale: 16384.0 | grad norm: 165868.654 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2075/ 159576 | consumed samples: 33200 | elapsed time per iteration (ms): 13598.1 | learning rate: 9.200E-06 | global batch size: 16 | lm loss: 7.105383E+00 | loss scale: 16384.0 | grad norm: 81641.263 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2076/ 159576 | consumed samples: 33216 | elapsed time per iteration (ms): 13844.7 | learning rate: 9.204E-06 | global batch size: 16 | lm loss: 6.954556E+00 | loss scale: 16384.0 | grad norm: 90347.722 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2077/ 159576 | consumed samples: 33232 | elapsed time per iteration (ms): 13642.3 | learning rate: 9.209E-06 | global batch size: 16 | lm loss: 6.986308E+00 | loss scale: 16384.0 | grad norm: 71161.614 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2078/ 159576 | consumed samples: 33248 | elapsed time per iteration (ms): 13714.7 | learning rate: 9.213E-06 | global batch size: 16 | lm loss: 7.186345E+00 | loss scale: 16384.0 | grad norm: 125006.131 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2079/ 159576 | consumed samples: 33264 | elapsed time per iteration (ms): 13724.6 | learning rate: 9.217E-06 | global batch size: 16 | lm loss: 7.046529E+00 | loss scale: 16384.0 | grad norm: 72474.668 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2080/ 159576 | consumed samples: 33280 | elapsed time per iteration (ms): 13823.6 | learning rate: 9.222E-06 | global batch size: 16 | lm loss: 6.926587E+00 | loss scale: 16384.0 | grad norm: 72628.016 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2081/ 159576 | consumed samples: 33296 | elapsed time per iteration (ms): 13659.2 | learning rate: 9.226E-06 | global batch size: 16 | lm loss: 6.850713E+00 | loss scale: 16384.0 | grad norm: 78040.610 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2082/ 159576 | consumed samples: 33312 | elapsed time per iteration (ms): 13653.7 | learning rate: 9.231E-06 | global batch size: 16 | lm loss: 7.014567E+00 | loss scale: 16384.0 | grad norm: 88063.955 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2083/ 159576 | consumed samples: 33328 | elapsed time per iteration (ms): 13690.1 | learning rate: 9.235E-06 | global batch size: 16 | lm loss: 6.964838E+00 | loss scale: 16384.0 | grad norm: 68577.460 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2084/ 159576 | consumed samples: 33344 | elapsed time per iteration (ms): 14064.9 | learning rate: 9.240E-06 | global batch size: 16 | lm loss: 6.954602E+00 | loss scale: 16384.0 | grad norm: 70285.947 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2085/ 159576 | consumed samples: 33360 | elapsed time per iteration (ms): 13835.0 | learning rate: 9.244E-06 | global batch size: 16 | lm loss: 6.952052E+00 | loss scale: 16384.0 | grad norm: 85673.500 | num 
zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2086/ 159576 | consumed samples: 33376 | elapsed time per iteration (ms): 13813.8 | learning rate: 9.249E-06 | global batch size: 16 | lm loss: 6.909387E+00 | loss scale: 16384.0 | grad norm: 118966.507 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2087/ 159576 | consumed samples: 33392 | elapsed time per iteration (ms): 13678.6 | learning rate: 9.253E-06 | global batch size: 16 | lm loss: 6.961540E+00 | loss scale: 16384.0 | grad norm: 66329.626 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2088/ 159576 | consumed samples: 33408 | elapsed time per iteration (ms): 13699.4 | learning rate: 9.257E-06 | global batch size: 16 | lm loss: 7.038545E+00 | loss scale: 16384.0 | grad norm: 77147.534 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2089/ 159576 | consumed samples: 33424 | elapsed time per iteration (ms): 13870.3 | learning rate: 9.262E-06 | global batch size: 16 | lm loss: 6.829208E+00 | loss scale: 16384.0 | grad norm: 66850.604 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2090/ 159576 | consumed samples: 33440 | elapsed time per iteration (ms): 13553.2 | learning rate: 9.266E-06 | global batch size: 16 | lm loss: 6.885040E+00 | loss scale: 16384.0 | grad norm: 63418.965 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2091/ 159576 | consumed samples: 33456 | elapsed time per iteration (ms): 13563.4 | learning rate: 9.271E-06 | global batch size: 16 | lm loss: 7.227287E+00 | loss scale: 16384.0 | grad norm: 99229.607 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2092/ 159576 | consumed samples: 33472 | elapsed time per iteration (ms): 13616.1 | learning rate: 9.275E-06 | global batch size: 16 | lm loss: 7.151490E+00 | loss scale: 16384.0 | grad norm: 77793.238 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2093/ 159576 | consumed samples: 33488 | elapsed time per iteration (ms): 14020.5 | learning rate: 9.280E-06 | global batch size: 16 | lm loss: 6.956719E+00 | loss scale: 16384.0 | grad norm: 71078.394 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2094/ 159576 | consumed samples: 33504 | elapsed time per iteration (ms): 13583.2 | learning rate: 9.284E-06 | global batch size: 16 | lm loss: 6.863022E+00 | loss scale: 16384.0 | grad norm: 75874.396 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2095/ 159576 | consumed samples: 33520 | elapsed time per iteration (ms): 13540.7 | learning rate: 9.288E-06 | global batch size: 16 | lm loss: 7.230942E+00 | loss scale: 16384.0 | grad norm: 66376.740 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2096/ 159576 | consumed samples: 33536 | elapsed time per iteration (ms): 13617.6 | learning rate: 9.293E-06 | global batch size: 16 | lm loss: 6.938297E+00 | loss scale: 16384.0 | grad norm: 80597.438 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2097/ 159576 | consumed samples: 33552 | elapsed time per 
iteration (ms): 13611.2 | learning rate: 9.297E-06 | global batch size: 16 | lm loss: 6.750860E+00 | loss scale: 16384.0 | grad norm: 50768.159 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2098/ 159576 | consumed samples: 33568 | elapsed time per iteration (ms): 13781.0 | learning rate: 9.302E-06 | global batch size: 16 | lm loss: 6.866726E+00 | loss scale: 16384.0 | grad norm: 120258.979 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2099/ 159576 | consumed samples: 33584 | elapsed time per iteration (ms): 13657.4 | learning rate: 9.306E-06 | global batch size: 16 | lm loss: 6.825637E+00 | loss scale: 16384.0 | grad norm: 95301.455 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2100/ 159576 | consumed samples: 33600 | elapsed time per iteration (ms): 13666.9 | learning rate: 9.311E-06 | global batch size: 16 | lm loss: 6.864701E+00 | loss scale: 16384.0 | grad norm: 68908.392 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2101/ 159576 | consumed samples: 33616 | elapsed time per iteration (ms): 13629.3 | learning rate: 9.315E-06 | global batch size: 16 | lm loss: 6.992301E+00 | loss scale: 16384.0 | grad norm: 74768.073 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2102/ 159576 | consumed samples: 33632 | elapsed time per iteration (ms): 14067.7 | learning rate: 9.320E-06 | global batch size: 16 | lm loss: 7.044778E+00 | loss scale: 16384.0 | grad norm: 118054.957 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2103/ 159576 | consumed samples: 33648 | elapsed time per iteration (ms): 13615.1 | learning rate: 9.324E-06 | global batch size: 16 | lm loss: 7.033617E+00 | loss scale: 16384.0 | grad norm: 69826.634 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2104/ 159576 | consumed samples: 33664 | elapsed time per iteration (ms): 13577.5 | learning rate: 9.328E-06 | global batch size: 16 | lm loss: 6.970243E+00 | loss scale: 16384.0 | grad norm: 88873.689 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2105/ 159576 | consumed samples: 33680 | elapsed time per iteration (ms): 13581.9 | learning rate: 9.333E-06 | global batch size: 16 | lm loss: 6.917067E+00 | loss scale: 16384.0 | grad norm: 93657.084 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2106/ 159576 | consumed samples: 33696 | elapsed time per iteration (ms): 14007.1 | learning rate: 9.337E-06 | global batch size: 16 | lm loss: 7.027580E+00 | loss scale: 16384.0 | grad norm: 62511.740 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2107/ 159576 | consumed samples: 33712 | elapsed time per iteration (ms): 13598.0 | learning rate: 9.342E-06 | global batch size: 16 | lm loss: 7.132909E+00 | loss scale: 16384.0 | grad norm: 177960.752 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2108/ 159576 | consumed samples: 33728 | elapsed time per iteration (ms): 13635.0 | learning rate: 9.346E-06 | global batch size: 16 | lm loss: 7.048873E+00 | loss scale: 16384.0 | grad norm: 122116.263 | num zeros: 0.0 | 
number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2109/ 159576 | consumed samples: 33744 | elapsed time per iteration (ms): 13663.3 | learning rate: 9.351E-06 | global batch size: 16 | lm loss: 6.996678E+00 | loss scale: 16384.0 | grad norm: 85763.068 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2110/ 159576 | consumed samples: 33760 | elapsed time per iteration (ms): 13680.8 | learning rate: 9.355E-06 | global batch size: 16 | lm loss: 6.889836E+00 | loss scale: 16384.0 | grad norm: 84089.334 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2111/ 159576 | consumed samples: 33776 | elapsed time per iteration (ms): 13628.5 | learning rate: 9.359E-06 | global batch size: 16 | lm loss: 6.968468E+00 | loss scale: 16384.0 | grad norm: 51256.696 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2112/ 159576 | consumed samples: 33792 | elapsed time per iteration (ms): 13610.9 | learning rate: 9.364E-06 | global batch size: 16 | lm loss: 6.917239E+00 | loss scale: 16384.0 | grad norm: 126008.694 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2113/ 159576 | consumed samples: 33808 | elapsed time per iteration (ms): 13593.1 | learning rate: 9.368E-06 | global batch size: 16 | lm loss: 6.871556E+00 | loss scale: 16384.0 | grad norm: 67758.611 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2114/ 159576 | consumed samples: 33824 | elapsed time per iteration (ms): 13663.1 | learning rate: 9.373E-06 | global batch size: 16 | lm loss: 6.927833E+00 | loss scale: 16384.0 | grad norm: 85851.310 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2115/ 159576 | consumed samples: 33840 | elapsed time per iteration (ms): 13986.1 | learning rate: 9.377E-06 | global batch size: 16 | lm loss: 6.965062E+00 | loss scale: 16384.0 | grad norm: 65169.232 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2116/ 159576 | consumed samples: 33856 | elapsed time per iteration (ms): 13585.2 | learning rate: 9.382E-06 | global batch size: 16 | lm loss: 7.081017E+00 | loss scale: 16384.0 | grad norm: 73782.925 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2117/ 159576 | consumed samples: 33872 | elapsed time per iteration (ms): 13717.9 | learning rate: 9.386E-06 | global batch size: 16 | lm loss: 7.005242E+00 | loss scale: 16384.0 | grad norm: 125037.412 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2118/ 159576 | consumed samples: 33888 | elapsed time per iteration (ms): 13567.3 | learning rate: 9.391E-06 | global batch size: 16 | lm loss: 6.785961E+00 | loss scale: 16384.0 | grad norm: 74382.903 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2119/ 159576 | consumed samples: 33904 | elapsed time per iteration (ms): 13839.4 | learning rate: 9.395E-06 | global batch size: 16 | lm loss: 7.037541E+00 | loss scale: 16384.0 | grad norm: 61070.302 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2120/ 159576 | consumed samples: 33920 | elapsed time per iteration (ms): 
13840.1 | learning rate: 9.399E-06 | global batch size: 16 | lm loss: 6.688106E+00 | loss scale: 16384.0 | grad norm: 77514.323 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2121/ 159576 | consumed samples: 33936 | elapsed time per iteration (ms): 13591.3 | learning rate: 9.404E-06 | global batch size: 16 | lm loss: 6.965182E+00 | loss scale: 16384.0 | grad norm: 85559.683 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2122/ 159576 | consumed samples: 33952 | elapsed time per iteration (ms): 13658.1 | learning rate: 9.408E-06 | global batch size: 16 | lm loss: 6.891047E+00 | loss scale: 16384.0 | grad norm: 84454.855 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2123/ 159576 | consumed samples: 33968 | elapsed time per iteration (ms): 13650.8 | learning rate: 9.413E-06 | global batch size: 16 | lm loss: 6.784370E+00 | loss scale: 16384.0 | grad norm: 74803.193 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2124/ 159576 | consumed samples: 33984 | elapsed time per iteration (ms): 13935.2 | learning rate: 9.417E-06 | global batch size: 16 | lm loss: 6.885671E+00 | loss scale: 16384.0 | grad norm: 68340.117 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2125/ 159576 | consumed samples: 34000 | elapsed time per iteration (ms): 13650.4 | learning rate: 9.422E-06 | global batch size: 16 | lm loss: 7.116186E+00 | loss scale: 16384.0 | grad norm: 75719.601 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2126/ 159576 | consumed samples: 34016 | elapsed time per iteration (ms): 13617.2 | learning rate: 9.426E-06 | global batch size: 16 | lm loss: 6.759393E+00 | loss scale: 16384.0 | grad norm: 57051.263 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2127/ 159576 | consumed samples: 34032 | elapsed time per iteration (ms): 13606.4 | learning rate: 9.430E-06 | global batch size: 16 | lm loss: 6.895882E+00 | loss scale: 16384.0 | grad norm: 117422.540 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2128/ 159576 | consumed samples: 34048 | elapsed time per iteration (ms): 13879.5 | learning rate: 9.435E-06 | global batch size: 16 | lm loss: 6.990780E+00 | loss scale: 16384.0 | grad norm: 47327.196 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2129/ 159576 | consumed samples: 34064 | elapsed time per iteration (ms): 13685.2 | learning rate: 9.439E-06 | global batch size: 16 | lm loss: 6.883922E+00 | loss scale: 16384.0 | grad norm: 75631.645 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2130/ 159576 | consumed samples: 34080 | elapsed time per iteration (ms): 13677.5 | learning rate: 9.444E-06 | global batch size: 16 | lm loss: 6.880146E+00 | loss scale: 16384.0 | grad norm: 70634.211 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2131/ 159576 | consumed samples: 34096 | elapsed time per iteration (ms): 13735.8 | learning rate: 9.448E-06 | global batch size: 16 | lm loss: 6.800762E+00 | loss scale: 16384.0 | grad norm: 114482.498 | num zeros: 0.0 | number of skipped 
iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2132/ 159576 | consumed samples: 34112 | elapsed time per iteration (ms): 13614.4 | learning rate: 9.453E-06 | global batch size: 16 | lm loss: 7.057775E+00 | loss scale: 16384.0 | grad norm: 131631.194 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2133/ 159576 | consumed samples: 34128 | elapsed time per iteration (ms): 13899.1 | learning rate: 9.457E-06 | global batch size: 16 | lm loss: 7.006071E+00 | loss scale: 16384.0 | grad norm: 88510.853 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2134/ 159576 | consumed samples: 34144 | elapsed time per iteration (ms): 13637.7 | learning rate: 9.462E-06 | global batch size: 16 | lm loss: 7.062113E+00 | loss scale: 16384.0 | grad norm: 75449.578 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2135/ 159576 | consumed samples: 34160 | elapsed time per iteration (ms): 13602.2 | learning rate: 9.466E-06 | global batch size: 16 | lm loss: 7.078564E+00 | loss scale: 16384.0 | grad norm: 130110.649 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2136/ 159576 | consumed samples: 34176 | elapsed time per iteration (ms): 13592.0 | learning rate: 9.470E-06 | global batch size: 16 | lm loss: 6.814717E+00 | loss scale: 16384.0 | grad norm: 149407.279 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2137/ 159576 | consumed samples: 34192 | elapsed time per iteration (ms): 14082.9 | learning rate: 9.475E-06 | global batch size: 16 | lm loss: 6.978102E+00 | loss scale: 16384.0 | grad norm: 53919.012 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2138/ 159576 | consumed samples: 34208 | elapsed time per iteration (ms): 13782.2 | learning rate: 9.479E-06 | global batch size: 16 | lm loss: 6.799563E+00 | loss scale: 16384.0 | grad norm: 71961.337 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2139/ 159576 | consumed samples: 34224 | elapsed time per iteration (ms): 13617.0 | learning rate: 9.484E-06 | global batch size: 16 | lm loss: 6.855867E+00 | loss scale: 16384.0 | grad norm: 59818.367 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2140/ 159576 | consumed samples: 34240 | elapsed time per iteration (ms): 13639.2 | learning rate: 9.488E-06 | global batch size: 16 | lm loss: 6.902345E+00 | loss scale: 16384.0 | grad norm: 58890.649 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2141/ 159576 | consumed samples: 34256 | elapsed time per iteration (ms): 13987.1 | learning rate: 9.493E-06 | global batch size: 16 | lm loss: 6.755795E+00 | loss scale: 16384.0 | grad norm: 77002.230 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2142/ 159576 | consumed samples: 34272 | elapsed time per iteration (ms): 13630.0 | learning rate: 9.497E-06 | global batch size: 16 | lm loss: 6.875304E+00 | loss scale: 16384.0 | grad norm: 67923.163 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2143/ 159576 | consumed samples: 34288 | elapsed time per iteration (ms): 13550.6 | learning 
rate: 9.501E-06 | global batch size: 16 | lm loss: 6.950579E+00 | loss scale: 16384.0 | grad norm: 177721.329 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2144/ 159576 | consumed samples: 34304 | elapsed time per iteration (ms): 13618.0 | learning rate: 9.506E-06 | global batch size: 16 | lm loss: 6.968021E+00 | loss scale: 16384.0 | grad norm: 116784.963 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2145/ 159576 | consumed samples: 34320 | elapsed time per iteration (ms): 13676.0 | learning rate: 9.510E-06 | global batch size: 16 | lm loss: 6.878886E+00 | loss scale: 16384.0 | grad norm: 69612.138 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2146/ 159576 | consumed samples: 34336 | elapsed time per iteration (ms): 13771.3 | learning rate: 9.515E-06 | global batch size: 16 | lm loss: 6.903853E+00 | loss scale: 16384.0 | grad norm: 80623.990 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2147/ 159576 | consumed samples: 34352 | elapsed time per iteration (ms): 13687.5 | learning rate: 9.519E-06 | global batch size: 16 | lm loss: 6.992352E+00 | loss scale: 16384.0 | grad norm: 50990.170 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2148/ 159576 | consumed samples: 34368 | elapsed time per iteration (ms): 13681.5 | learning rate: 9.524E-06 | global batch size: 16 | lm loss: 6.979048E+00 | loss scale: 16384.0 | grad norm: 120685.818 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2149/ 159576 | consumed samples: 34384 | elapsed time per iteration (ms): 13585.6 | learning rate: 9.528E-06 | global batch size: 16 | lm loss: 6.962264E+00 | loss scale: 16384.0 | grad norm: 95096.210 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2150/ 159576 | consumed samples: 34400 | elapsed time per iteration (ms): 13964.4 | learning rate: 9.533E-06 | global batch size: 16 | lm loss: 7.070148E+00 | loss scale: 16384.0 | grad norm: 102834.582 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2151/ 159576 | consumed samples: 34416 | elapsed time per iteration (ms): 13597.2 | learning rate: 9.537E-06 | global batch size: 16 | lm loss: 6.998973E+00 | loss scale: 16384.0 | grad norm: 66036.970 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2152/ 159576 | consumed samples: 34432 | elapsed time per iteration (ms): 13608.8 | learning rate: 9.541E-06 | global batch size: 16 | lm loss: 6.972906E+00 | loss scale: 16384.0 | grad norm: 85292.027 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2153/ 159576 | consumed samples: 34448 | elapsed time per iteration (ms): 13623.2 | learning rate: 9.546E-06 | global batch size: 16 | lm loss: 6.755056E+00 | loss scale: 16384.0 | grad norm: 76762.492 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2154/ 159576 | consumed samples: 34464 | elapsed time per iteration (ms): 13956.2 | learning rate: 9.550E-06 | global batch size: 16 | lm loss: 7.015395E+00 | loss scale: 16384.0 | grad norm: 90062.733 | num zeros: 0.0 | number of skipped iterations: 0 | 
number of nan iterations: 0 | -time (ms) - iteration 2155/ 159576 | consumed samples: 34480 | elapsed time per iteration (ms): 13759.1 | learning rate: 9.555E-06 | global batch size: 16 | lm loss: 6.815333E+00 | loss scale: 16384.0 | grad norm: 68441.221 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2156/ 159576 | consumed samples: 34496 | elapsed time per iteration (ms): 13580.0 | learning rate: 9.559E-06 | global batch size: 16 | lm loss: 6.783628E+00 | loss scale: 16384.0 | grad norm: 110716.577 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2157/ 159576 | consumed samples: 34512 | elapsed time per iteration (ms): 13582.3 | learning rate: 9.564E-06 | global batch size: 16 | lm loss: 7.064082E+00 | loss scale: 16384.0 | grad norm: 62285.534 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2158/ 159576 | consumed samples: 34528 | elapsed time per iteration (ms): 13596.2 | learning rate: 9.568E-06 | global batch size: 16 | lm loss: 7.092577E+00 | loss scale: 16384.0 | grad norm: 69925.096 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2159/ 159576 | consumed samples: 34544 | elapsed time per iteration (ms): 13966.6 | learning rate: 9.572E-06 | global batch size: 16 | lm loss: 7.030209E+00 | loss scale: 16384.0 | grad norm: 74908.048 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2160/ 159576 | consumed samples: 34560 | elapsed time per iteration (ms): 13608.2 | learning rate: 9.577E-06 | global batch size: 16 | lm loss: 6.985407E+00 | loss scale: 16384.0 | grad norm: 107105.025 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2161/ 159576 | consumed samples: 34576 | elapsed time per iteration (ms): 13591.8 | learning rate: 9.581E-06 | global batch size: 16 | lm loss: 6.846824E+00 | loss scale: 16384.0 | grad norm: 59511.297 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2162/ 159576 | consumed samples: 34592 | elapsed time per iteration (ms): 13686.7 | learning rate: 9.586E-06 | global batch size: 16 | lm loss: 6.984041E+00 | loss scale: 16384.0 | grad norm: 81334.026 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2163/ 159576 | consumed samples: 34608 | elapsed time per iteration (ms): 13937.5 | learning rate: 9.590E-06 | global batch size: 16 | lm loss: 7.022871E+00 | loss scale: 16384.0 | grad norm: 84185.459 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2164/ 159576 | consumed samples: 34624 | elapsed time per iteration (ms): 13577.7 | learning rate: 9.595E-06 | global batch size: 16 | lm loss: 7.029066E+00 | loss scale: 16384.0 | grad norm: 47624.311 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2165/ 159576 | consumed samples: 34640 | elapsed time per iteration (ms): 13595.6 | learning rate: 9.599E-06 | global batch size: 16 | lm loss: 6.822045E+00 | loss scale: 16384.0 | grad norm: 138589.166 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2166/ 159576 | consumed samples: 34656 | elapsed time per iteration (ms): 13704.6 | learning rate: 9.604E-06 | 
global batch size: 16 | lm loss: 6.980874E+00 | loss scale: 16384.0 | grad norm: 80500.034 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2167/ 159576 | consumed samples: 34672 | elapsed time per iteration (ms): 13517.8 | learning rate: 9.608E-06 | global batch size: 16 | lm loss: 7.052095E+00 | loss scale: 16384.0 | grad norm: 68630.752 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2168/ 159576 | consumed samples: 34688 | elapsed time per iteration (ms): 13832.6 | learning rate: 9.612E-06 | global batch size: 16 | lm loss: 7.172165E+00 | loss scale: 16384.0 | grad norm: 59001.711 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2169/ 159576 | consumed samples: 34704 | elapsed time per iteration (ms): 13681.3 | learning rate: 9.617E-06 | global batch size: 16 | lm loss: 7.068394E+00 | loss scale: 16384.0 | grad norm: 73598.207 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2170/ 159576 | consumed samples: 34720 | elapsed time per iteration (ms): 13669.0 | learning rate: 9.621E-06 | global batch size: 16 | lm loss: 6.842896E+00 | loss scale: 16384.0 | grad norm: 62440.681 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2171/ 159576 | consumed samples: 34736 | elapsed time per iteration (ms): 13648.5 | learning rate: 9.626E-06 | global batch size: 16 | lm loss: 7.126867E+00 | loss scale: 16384.0 | grad norm: 155364.353 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2172/ 159576 | consumed samples: 34752 | elapsed time per iteration (ms): 14078.1 | learning rate: 9.630E-06 | global batch size: 16 | lm loss: 7.047744E+00 | loss scale: 16384.0 | grad norm: 113473.385 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2173/ 159576 | consumed samples: 34768 | elapsed time per iteration (ms): 13680.5 | learning rate: 9.635E-06 | global batch size: 16 | lm loss: 7.016094E+00 | loss scale: 16384.0 | grad norm: 73489.301 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2174/ 159576 | consumed samples: 34784 | elapsed time per iteration (ms): 13666.0 | learning rate: 9.639E-06 | global batch size: 16 | lm loss: 7.061403E+00 | loss scale: 16384.0 | grad norm: 75521.374 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2175/ 159576 | consumed samples: 34800 | elapsed time per iteration (ms): 13610.4 | learning rate: 9.643E-06 | global batch size: 16 | lm loss: 7.042882E+00 | loss scale: 16384.0 | grad norm: 95300.955 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2176/ 159576 | consumed samples: 34816 | elapsed time per iteration (ms): 14108.9 | learning rate: 9.648E-06 | global batch size: 16 | lm loss: 6.915576E+00 | loss scale: 16384.0 | grad norm: 74751.665 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2177/ 159576 | consumed samples: 34832 | elapsed time per iteration (ms): 13643.1 | learning rate: 9.652E-06 | global batch size: 16 | lm loss: 6.979721E+00 | loss scale: 16384.0 | grad norm: 71252.622 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan 
iterations: 0 | -time (ms) - iteration 2178/ 159576 | consumed samples: 34848 | elapsed time per iteration (ms): 13642.9 | learning rate: 9.657E-06 | global batch size: 16 | lm loss: 6.816618E+00 | loss scale: 16384.0 | grad norm: 60039.955 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2179/ 159576 | consumed samples: 34864 | elapsed time per iteration (ms): 13628.9 | learning rate: 9.661E-06 | global batch size: 16 | lm loss: 7.054741E+00 | loss scale: 16384.0 | grad norm: 196305.881 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2180/ 159576 | consumed samples: 34880 | elapsed time per iteration (ms): 13588.5 | learning rate: 9.666E-06 | global batch size: 16 | lm loss: 6.953914E+00 | loss scale: 16384.0 | grad norm: 120715.141 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2181/ 159576 | consumed samples: 34896 | elapsed time per iteration (ms): 13968.3 | learning rate: 9.670E-06 | global batch size: 16 | lm loss: 7.034101E+00 | loss scale: 16384.0 | grad norm: 81756.186 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2182/ 159576 | consumed samples: 34912 | elapsed time per iteration (ms): 13658.7 | learning rate: 9.675E-06 | global batch size: 16 | lm loss: 6.787637E+00 | loss scale: 16384.0 | grad norm: 99431.755 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2183/ 159576 | consumed samples: 34928 | elapsed time per iteration (ms): 13669.1 | learning rate: 9.679E-06 | global batch size: 16 | lm loss: 6.894065E+00 | loss scale: 16384.0 | grad norm: 83400.667 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2184/ 159576 | consumed samples: 34944 | elapsed time per iteration (ms): 13649.9 | learning rate: 9.683E-06 | global batch size: 16 | lm loss: 6.871455E+00 | loss scale: 16384.0 | grad norm: 159204.546 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2185/ 159576 | consumed samples: 34960 | elapsed time per iteration (ms): 14059.0 | learning rate: 9.688E-06 | global batch size: 16 | lm loss: 6.954823E+00 | loss scale: 16384.0 | grad norm: 106187.044 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2186/ 159576 | consumed samples: 34976 | elapsed time per iteration (ms): 13651.8 | learning rate: 9.692E-06 | global batch size: 16 | lm loss: 7.198211E+00 | loss scale: 16384.0 | grad norm: 95306.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2187/ 159576 | consumed samples: 34992 | elapsed time per iteration (ms): 13612.8 | learning rate: 9.697E-06 | global batch size: 16 | lm loss: 7.037758E+00 | loss scale: 16384.0 | grad norm: 86743.620 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2188/ 159576 | consumed samples: 35008 | elapsed time per iteration (ms): 13616.1 | learning rate: 9.701E-06 | global batch size: 16 | lm loss: 6.780216E+00 | loss scale: 16384.0 | grad norm: 66759.645 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2189/ 159576 | consumed samples: 35024 | elapsed time per iteration (ms): 13935.4 | learning rate: 9.706E-06 | global batch 
size: 16 | lm loss: 7.134370E+00 | loss scale: 16384.0 | grad norm: 224387.512 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2190/ 159576 | consumed samples: 35040 | elapsed time per iteration (ms): 13796.3 | learning rate: 9.710E-06 | global batch size: 16 | lm loss: 6.830962E+00 | loss scale: 16384.0 | grad norm: 184503.407 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2191/ 159576 | consumed samples: 35056 | elapsed time per iteration (ms): 13596.6 | learning rate: 9.714E-06 | global batch size: 16 | lm loss: 7.006136E+00 | loss scale: 16384.0 | grad norm: 105791.757 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2192/ 159576 | consumed samples: 35072 | elapsed time per iteration (ms): 13632.0 | learning rate: 9.719E-06 | global batch size: 16 | lm loss: 7.023957E+00 | loss scale: 16384.0 | grad norm: 128317.920 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2193/ 159576 | consumed samples: 35088 | elapsed time per iteration (ms): 13700.7 | learning rate: 9.723E-06 | global batch size: 16 | lm loss: 6.920637E+00 | loss scale: 16384.0 | grad norm: 90884.730 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2194/ 159576 | consumed samples: 35104 | elapsed time per iteration (ms): 13995.7 | learning rate: 9.728E-06 | global batch size: 16 | lm loss: 7.240769E+00 | loss scale: 16384.0 | grad norm: 157352.501 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2195/ 159576 | consumed samples: 35120 | elapsed time per iteration (ms): 13669.4 | learning rate: 9.732E-06 | global batch size: 16 | lm loss: 6.780205E+00 | loss scale: 16384.0 | grad norm: 106455.602 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2196/ 159576 | consumed samples: 35136 | elapsed time per iteration (ms): 13670.0 | learning rate: 9.737E-06 | global batch size: 16 | lm loss: 6.778285E+00 | loss scale: 16384.0 | grad norm: 86879.374 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2197/ 159576 | consumed samples: 35152 | elapsed time per iteration (ms): 13661.3 | learning rate: 9.741E-06 | global batch size: 16 | lm loss: 7.030122E+00 | loss scale: 16384.0 | grad norm: 93377.129 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2198/ 159576 | consumed samples: 35168 | elapsed time per iteration (ms): 13923.4 | learning rate: 9.746E-06 | global batch size: 16 | lm loss: 6.727036E+00 | loss scale: 16384.0 | grad norm: 148918.392 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2199/ 159576 | consumed samples: 35184 | elapsed time per iteration (ms): 13675.4 | learning rate: 9.750E-06 | global batch size: 16 | lm loss: 7.104040E+00 | loss scale: 16384.0 | grad norm: 135532.675 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2200/ 159576 | consumed samples: 35200 | elapsed time per iteration (ms): 13739.5 | learning rate: 9.754E-06 | global batch size: 16 | lm loss: 6.969880E+00 | loss scale: 16384.0 | grad norm: 96195.135 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 
| -time (ms) - iteration 2201/ 159576 | consumed samples: 35216 | elapsed time per iteration (ms): 13703.1 | learning rate: 9.759E-06 | global batch size: 16 | lm loss: 7.123239E+00 | loss scale: 16384.0 | grad norm: 89259.239 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2202/ 159576 | consumed samples: 35232 | elapsed time per iteration (ms): 13665.4 | learning rate: 9.763E-06 | global batch size: 16 | lm loss: 6.652438E+00 | loss scale: 16384.0 | grad norm: 70165.954 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2203/ 159576 | consumed samples: 35248 | elapsed time per iteration (ms): 13954.1 | learning rate: 9.768E-06 | global batch size: 16 | lm loss: 6.943371E+00 | loss scale: 16384.0 | grad norm: 138696.234 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2204/ 159576 | consumed samples: 35264 | elapsed time per iteration (ms): 13604.7 | learning rate: 9.772E-06 | global batch size: 16 | lm loss: 6.743501E+00 | loss scale: 16384.0 | grad norm: 190526.042 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2205/ 159576 | consumed samples: 35280 | elapsed time per iteration (ms): 13626.5 | learning rate: 9.777E-06 | global batch size: 16 | lm loss: 6.968715E+00 | loss scale: 16384.0 | grad norm: 97137.923 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2206/ 159576 | consumed samples: 35296 | elapsed time per iteration (ms): 13767.5 | learning rate: 9.781E-06 | global batch size: 16 | lm loss: 6.911567E+00 | loss scale: 16384.0 | grad norm: 68778.743 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2207/ 159576 | consumed samples: 35312 | elapsed time per iteration (ms): 14159.2 | learning rate: 9.786E-06 | global batch size: 16 | lm loss: 7.117369E+00 | loss scale: 16384.0 | grad norm: 70066.331 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2208/ 159576 | consumed samples: 35328 | elapsed time per iteration (ms): 13832.5 | learning rate: 9.790E-06 | global batch size: 16 | lm loss: 7.121370E+00 | loss scale: 16384.0 | grad norm: 98891.631 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2209/ 159576 | consumed samples: 35344 | elapsed time per iteration (ms): 13749.3 | learning rate: 9.794E-06 | global batch size: 16 | lm loss: 6.873634E+00 | loss scale: 16384.0 | grad norm: 61060.289 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2210/ 159576 | consumed samples: 35360 | elapsed time per iteration (ms): 13710.7 | learning rate: 9.799E-06 | global batch size: 16 | lm loss: 6.761906E+00 | loss scale: 16384.0 | grad norm: 87340.173 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2211/ 159576 | consumed samples: 35376 | elapsed time per iteration (ms): 14073.4 | learning rate: 9.803E-06 | global batch size: 16 | lm loss: 6.896225E+00 | loss scale: 16384.0 | grad norm: 67623.817 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2212/ 159576 | consumed samples: 35392 | elapsed time per iteration (ms): 13676.6 | learning rate: 9.808E-06 | global batch size: 16 | lm 
loss: 6.925282E+00 | loss scale: 16384.0 | grad norm: 112986.049 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2213/ 159576 | consumed samples: 35408 | elapsed time per iteration (ms): 13682.0 | learning rate: 9.812E-06 | global batch size: 16 | lm loss: 6.932837E+00 | loss scale: 16384.0 | grad norm: 72538.119 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2214/ 159576 | consumed samples: 35424 | elapsed time per iteration (ms): 13773.0 | learning rate: 9.817E-06 | global batch size: 16 | lm loss: 6.751261E+00 | loss scale: 16384.0 | grad norm: 110253.980 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2215/ 159576 | consumed samples: 35440 | elapsed time per iteration (ms): 13688.8 | learning rate: 9.821E-06 | global batch size: 16 | lm loss: 6.953260E+00 | loss scale: 16384.0 | grad norm: 85951.671 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2216/ 159576 | consumed samples: 35456 | elapsed time per iteration (ms): 13877.0 | learning rate: 9.825E-06 | global batch size: 16 | lm loss: 6.963014E+00 | loss scale: 16384.0 | grad norm: 78883.228 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2217/ 159576 | consumed samples: 35472 | elapsed time per iteration (ms): 13727.8 | learning rate: 9.830E-06 | global batch size: 16 | lm loss: 6.840832E+00 | loss scale: 16384.0 | grad norm: 92435.156 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2218/ 159576 | consumed samples: 35488 | elapsed time per iteration (ms): 13750.4 | learning rate: 9.834E-06 | global batch size: 16 | lm loss: 6.949021E+00 | loss scale: 16384.0 | grad norm: 60313.225 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2219/ 159576 | consumed samples: 35504 | elapsed time per iteration (ms): 13607.8 | learning rate: 9.839E-06 | global batch size: 16 | lm loss: 6.950431E+00 | loss scale: 16384.0 | grad norm: 92434.517 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2220/ 159576 | consumed samples: 35520 | elapsed time per iteration (ms): 14159.9 | learning rate: 9.843E-06 | global batch size: 16 | lm loss: 7.318023E+00 | loss scale: 16384.0 | grad norm: 75178.025 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2221/ 159576 | consumed samples: 35536 | elapsed time per iteration (ms): 13828.1 | learning rate: 9.848E-06 | global batch size: 16 | lm loss: 6.425551E+00 | loss scale: 16384.0 | grad norm: 66904.070 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2222/ 159576 | consumed samples: 35552 | elapsed time per iteration (ms): 13669.2 | learning rate: 9.852E-06 | global batch size: 16 | lm loss: 7.016433E+00 | loss scale: 16384.0 | grad norm: 48549.102 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2223/ 159576 | consumed samples: 35568 | elapsed time per iteration (ms): 13705.5 | learning rate: 9.857E-06 | global batch size: 16 | lm loss: 7.026052E+00 | loss scale: 16384.0 | grad norm: 87253.670 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - 
iteration 2224/ 159576 | consumed samples: 35584 | elapsed time per iteration (ms): 14141.1 | learning rate: 9.861E-06 | global batch size: 16 | lm loss: 7.019730E+00 | loss scale: 16384.0 | grad norm: 75100.959 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2225/ 159576 | consumed samples: 35600 | elapsed time per iteration (ms): 13696.3 | learning rate: 9.865E-06 | global batch size: 16 | lm loss: 6.750052E+00 | loss scale: 16384.0 | grad norm: 72544.618 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2226/ 159576 | consumed samples: 35616 | elapsed time per iteration (ms): 13659.8 | learning rate: 9.870E-06 | global batch size: 16 | lm loss: 6.815751E+00 | loss scale: 16384.0 | grad norm: 76403.248 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2227/ 159576 | consumed samples: 35632 | elapsed time per iteration (ms): 13696.5 | learning rate: 9.874E-06 | global batch size: 16 | lm loss: 6.716208E+00 | loss scale: 16384.0 | grad norm: 70565.479 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2228/ 159576 | consumed samples: 35648 | elapsed time per iteration (ms): 13652.7 | learning rate: 9.879E-06 | global batch size: 16 | lm loss: 6.902302E+00 | loss scale: 16384.0 | grad norm: 99921.530 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2229/ 159576 | consumed samples: 35664 | elapsed time per iteration (ms): 13754.5 | learning rate: 9.883E-06 | global batch size: 16 | lm loss: 6.941592E+00 | loss scale: 16384.0 | grad norm: 77045.459 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2230/ 159576 | consumed samples: 35680 | elapsed time per iteration (ms): 13726.8 | learning rate: 9.888E-06 | global batch size: 16 | lm loss: 7.006780E+00 | loss scale: 16384.0 | grad norm: 79594.378 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2231/ 159576 | consumed samples: 35696 | elapsed time per iteration (ms): 13704.0 | learning rate: 9.892E-06 | global batch size: 16 | lm loss: 7.056840E+00 | loss scale: 16384.0 | grad norm: 72251.485 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2232/ 159576 | consumed samples: 35712 | elapsed time per iteration (ms): 13646.8 | learning rate: 9.896E-06 | global batch size: 16 | lm loss: 6.913527E+00 | loss scale: 16384.0 | grad norm: 58442.793 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2233/ 159576 | consumed samples: 35728 | elapsed time per iteration (ms): 14009.0 | learning rate: 9.901E-06 | global batch size: 16 | lm loss: 6.865626E+00 | loss scale: 16384.0 | grad norm: 73447.631 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2234/ 159576 | consumed samples: 35744 | elapsed time per iteration (ms): 13550.7 | learning rate: 9.905E-06 | global batch size: 16 | lm loss: 6.954779E+00 | loss scale: 16384.0 | grad norm: 63007.809 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2235/ 159576 | consumed samples: 35760 | elapsed time per iteration (ms): 13638.3 | learning rate: 9.910E-06 | global batch size: 16 | lm loss: 6.917772E+00 
| loss scale: 16384.0 | grad norm: 73029.876 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2236/ 159576 | consumed samples: 35776 | elapsed time per iteration (ms): 13495.6 | learning rate: 9.914E-06 | global batch size: 16 | lm loss: 6.899360E+00 | loss scale: 16384.0 | grad norm: 58524.994 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2237/ 159576 | consumed samples: 35792 | elapsed time per iteration (ms): 13933.0 | learning rate: 9.919E-06 | global batch size: 16 | lm loss: 6.898277E+00 | loss scale: 16384.0 | grad norm: 89250.802 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2238/ 159576 | consumed samples: 35808 | elapsed time per iteration (ms): 13906.4 | learning rate: 9.923E-06 | global batch size: 16 | lm loss: 6.863415E+00 | loss scale: 16384.0 | grad norm: 57965.777 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2239/ 159576 | consumed samples: 35824 | elapsed time per iteration (ms): 13638.8 | learning rate: 9.928E-06 | global batch size: 16 | lm loss: 6.994671E+00 | loss scale: 16384.0 | grad norm: 102232.968 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2240/ 159576 | consumed samples: 35840 | elapsed time per iteration (ms): 13621.9 | learning rate: 9.932E-06 | global batch size: 16 | lm loss: 6.956360E+00 | loss scale: 16384.0 | grad norm: 69904.385 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2241/ 159576 | consumed samples: 35856 | elapsed time per iteration (ms): 13633.2 | learning rate: 9.936E-06 | global batch size: 16 | lm loss: 6.939447E+00 | loss scale: 16384.0 | grad norm: 95578.290 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2242/ 159576 | consumed samples: 35872 | elapsed time per iteration (ms): 13726.4 | learning rate: 9.941E-06 | global batch size: 16 | lm loss: 7.046509E+00 | loss scale: 16384.0 | grad norm: 82383.239 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2243/ 159576 | consumed samples: 35888 | elapsed time per iteration (ms): 13506.7 | learning rate: 9.945E-06 | global batch size: 16 | lm loss: 7.151508E+00 | loss scale: 16384.0 | grad norm: 98476.196 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2244/ 159576 | consumed samples: 35904 | elapsed time per iteration (ms): 13568.6 | learning rate: 9.950E-06 | global batch size: 16 | lm loss: 6.872870E+00 | loss scale: 16384.0 | grad norm: 74912.305 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2245/ 159576 | consumed samples: 35920 | elapsed time per iteration (ms): 13602.7 | learning rate: 9.954E-06 | global batch size: 16 | lm loss: 6.673596E+00 | loss scale: 16384.0 | grad norm: 76531.716 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2246/ 159576 | consumed samples: 35936 | elapsed time per iteration (ms): 14093.3 | learning rate: 9.959E-06 | global batch size: 16 | lm loss: 6.910951E+00 | loss scale: 16384.0 | grad norm: 90155.766 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2247/ 159576 | 
consumed samples: 35952 | elapsed time per iteration (ms): 13495.1 | learning rate: 9.963E-06 | global batch size: 16 | lm loss: 6.761725E+00 | loss scale: 16384.0 | grad norm: 71637.396 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2248/ 159576 | consumed samples: 35968 | elapsed time per iteration (ms): 13629.2 | learning rate: 9.967E-06 | global batch size: 16 | lm loss: 6.898269E+00 | loss scale: 16384.0 | grad norm: 99310.370 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2249/ 159576 | consumed samples: 35984 | elapsed time per iteration (ms): 13535.5 | learning rate: 9.972E-06 | global batch size: 16 | lm loss: 6.917497E+00 | loss scale: 16384.0 | grad norm: 74932.151 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2250/ 159576 | consumed samples: 36000 | elapsed time per iteration (ms): 13554.8 | learning rate: 9.976E-06 | global batch size: 16 | lm loss: 6.728826E+00 | loss scale: 16384.0 | grad norm: 73535.130 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2251/ 159576 | consumed samples: 36016 | elapsed time per iteration (ms): 13742.7 | learning rate: 9.981E-06 | global batch size: 16 | lm loss: 6.901268E+00 | loss scale: 16384.0 | grad norm: 76822.648 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2252/ 159576 | consumed samples: 36032 | elapsed time per iteration (ms): 13586.6 | learning rate: 9.985E-06 | global batch size: 16 | lm loss: 6.964120E+00 | loss scale: 16384.0 | grad norm: 47563.266 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2253/ 159576 | consumed samples: 36048 | elapsed time per iteration (ms): 13621.0 | learning rate: 9.990E-06 | global batch size: 16 | lm loss: 6.976019E+00 | loss scale: 16384.0 | grad norm: 84584.649 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2254/ 159576 | consumed samples: 36064 | elapsed time per iteration (ms): 13682.5 | learning rate: 9.994E-06 | global batch size: 16 | lm loss: 6.875343E+00 | loss scale: 16384.0 | grad norm: 37745.320 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2255/ 159576 | consumed samples: 36080 | elapsed time per iteration (ms): 14145.6 | learning rate: 9.999E-06 | global batch size: 16 | lm loss: 6.934249E+00 | loss scale: 16384.0 | grad norm: 136584.539 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2256/ 159576 | consumed samples: 36096 | elapsed time per iteration (ms): 13651.1 | learning rate: 1.000E-05 | global batch size: 16 | lm loss: 6.785090E+00 | loss scale: 16384.0 | grad norm: 79752.112 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2257/ 159576 | consumed samples: 36112 | elapsed time per iteration (ms): 13492.4 | learning rate: 1.001E-05 | global batch size: 16 | lm loss: 6.860191E+00 | loss scale: 16384.0 | grad norm: 66550.522 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2258/ 159576 | consumed samples: 36128 | elapsed time per iteration (ms): 13560.5 | learning rate: 1.001E-05 | global batch size: 16 | lm loss: 6.910413E+00 | loss scale: 16384.0 | 
grad norm: 67569.003 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2259/ 159576 | consumed samples: 36144 | elapsed time per iteration (ms): 14039.9 | learning rate: 1.002E-05 | global batch size: 16 | lm loss: 7.188947E+00 | loss scale: 16384.0 | grad norm: 73452.334 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2260/ 159576 | consumed samples: 36160 | elapsed time per iteration (ms): 13575.5 | learning rate: 1.002E-05 | global batch size: 16 | lm loss: 6.873131E+00 | loss scale: 16384.0 | grad norm: 111867.072 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2261/ 159576 | consumed samples: 36176 | elapsed time per iteration (ms): 13638.2 | learning rate: 1.003E-05 | global batch size: 16 | lm loss: 6.838548E+00 | loss scale: 16384.0 | grad norm: 80423.624 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2262/ 159576 | consumed samples: 36192 | elapsed time per iteration (ms): 13658.9 | learning rate: 1.003E-05 | global batch size: 16 | lm loss: 7.019104E+00 | loss scale: 16384.0 | grad norm: 84663.314 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2263/ 159576 | consumed samples: 36208 | elapsed time per iteration (ms): 13616.1 | learning rate: 1.003E-05 | global batch size: 16 | lm loss: 6.917726E+00 | loss scale: 16384.0 | grad norm: 79078.388 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2264/ 159576 | consumed samples: 36224 | elapsed time per iteration (ms): 13773.7 | learning rate: 1.004E-05 | global batch size: 16 | lm loss: 7.129383E+00 | loss scale: 16384.0 | grad norm: 84356.561 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2265/ 159576 | consumed samples: 36240 | elapsed time per iteration (ms): 13599.9 | learning rate: 1.004E-05 | global batch size: 16 | lm loss: 6.950484E+00 | loss scale: 16384.0 | grad norm: 96317.698 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2266/ 159576 | consumed samples: 36256 | elapsed time per iteration (ms): 13555.3 | learning rate: 1.005E-05 | global batch size: 16 | lm loss: 6.983542E+00 | loss scale: 16384.0 | grad norm: 87963.519 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2267/ 159576 | consumed samples: 36272 | elapsed time per iteration (ms): 13615.4 | learning rate: 1.005E-05 | global batch size: 16 | lm loss: 7.106489E+00 | loss scale: 16384.0 | grad norm: 49938.774 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2268/ 159576 | consumed samples: 36288 | elapsed time per iteration (ms): 13987.6 | learning rate: 1.006E-05 | global batch size: 16 | lm loss: 6.957284E+00 | loss scale: 16384.0 | grad norm: 80083.213 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2269/ 159576 | consumed samples: 36304 | elapsed time per iteration (ms): 13613.8 | learning rate: 1.006E-05 | global batch size: 16 | lm loss: 6.895617E+00 | loss scale: 16384.0 | grad norm: 89537.779 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2270/ 159576 | consumed samples: 36320 
| elapsed time per iteration (ms): 13747.0 | learning rate: 1.007E-05 | global batch size: 16 | lm loss: 6.945907E+00 | loss scale: 16384.0 | grad norm: 109400.041 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2271/ 159576 | consumed samples: 36336 | elapsed time per iteration (ms): 13527.2 | learning rate: 1.007E-05 | global batch size: 16 | lm loss: 6.928704E+00 | loss scale: 16384.0 | grad norm: 78576.596 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2272/ 159576 | consumed samples: 36352 | elapsed time per iteration (ms): 13615.1 | learning rate: 1.007E-05 | global batch size: 16 | lm loss: 7.229642E+00 | loss scale: 16384.0 | grad norm: 80535.103 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2273/ 159576 | consumed samples: 36368 | elapsed time per iteration (ms): 13960.2 | learning rate: 1.008E-05 | global batch size: 16 | lm loss: 6.896622E+00 | loss scale: 16384.0 | grad norm: 65043.229 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2274/ 159576 | consumed samples: 36384 | elapsed time per iteration (ms): 13538.8 | learning rate: 1.008E-05 | global batch size: 16 | lm loss: 7.013526E+00 | loss scale: 16384.0 | grad norm: 78284.375 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2275/ 159576 | consumed samples: 36400 | elapsed time per iteration (ms): 13634.5 | learning rate: 1.009E-05 | global batch size: 16 | lm loss: 6.912004E+00 | loss scale: 16384.0 | grad norm: 66988.185 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2276/ 159576 | consumed samples: 36416 | elapsed time per iteration (ms): 13609.6 | learning rate: 1.009E-05 | global batch size: 16 | lm loss: 6.759723E+00 | loss scale: 16384.0 | grad norm: 69630.646 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2277/ 159576 | consumed samples: 36432 | elapsed time per iteration (ms): 14096.5 | learning rate: 1.010E-05 | global batch size: 16 | lm loss: 7.025202E+00 | loss scale: 16384.0 | grad norm: 66059.779 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2278/ 159576 | consumed samples: 36448 | elapsed time per iteration (ms): 13743.0 | learning rate: 1.010E-05 | global batch size: 16 | lm loss: 6.957587E+00 | loss scale: 16384.0 | grad norm: 80177.800 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2279/ 159576 | consumed samples: 36464 | elapsed time per iteration (ms): 13675.0 | learning rate: 1.011E-05 | global batch size: 16 | lm loss: 6.897773E+00 | loss scale: 16384.0 | grad norm: 50160.689 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2280/ 159576 | consumed samples: 36480 | elapsed time per iteration (ms): 13581.6 | learning rate: 1.011E-05 | global batch size: 16 | lm loss: 6.697253E+00 | loss scale: 16384.0 | grad norm: 64483.166 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2281/ 159576 | consumed samples: 36496 | elapsed time per iteration (ms): 13961.5 | learning rate: 1.011E-05 | global batch size: 16 | lm loss: 6.944922E+00 | loss scale: 16384.0 | grad norm: 67869.220 | 
num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2282/ 159576 | consumed samples: 36512 | elapsed time per iteration (ms): 13505.0 | learning rate: 1.012E-05 | global batch size: 16 | lm loss: 6.410736E+00 | loss scale: 16384.0 | grad norm: 49766.856 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2283/ 159576 | consumed samples: 36528 | elapsed time per iteration (ms): 13611.4 | learning rate: 1.012E-05 | global batch size: 16 | lm loss: 6.772882E+00 | loss scale: 16384.0 | grad norm: 59961.718 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2284/ 159576 | consumed samples: 36544 | elapsed time per iteration (ms): 13596.5 | learning rate: 1.013E-05 | global batch size: 16 | lm loss: 6.794603E+00 | loss scale: 16384.0 | grad norm: 68562.920 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2285/ 159576 | consumed samples: 36560 | elapsed time per iteration (ms): 13567.2 | learning rate: 1.013E-05 | global batch size: 16 | lm loss: 7.113194E+00 | loss scale: 16384.0 | grad norm: 59728.136 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2286/ 159576 | consumed samples: 36576 | elapsed time per iteration (ms): 13847.6 | learning rate: 1.014E-05 | global batch size: 16 | lm loss: 6.799785E+00 | loss scale: 16384.0 | grad norm: 76247.046 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2287/ 159576 | consumed samples: 36592 | elapsed time per iteration (ms): 13611.9 | learning rate: 1.014E-05 | global batch size: 16 | lm loss: 7.034187E+00 | loss scale: 16384.0 | grad norm: 50151.578 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2288/ 159576 | consumed samples: 36608 | elapsed time per iteration (ms): 13533.2 | learning rate: 1.014E-05 | global batch size: 16 | lm loss: 6.881348E+00 | loss scale: 16384.0 | grad norm: 130377.193 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2289/ 159576 | consumed samples: 36624 | elapsed time per iteration (ms): 13525.7 | learning rate: 1.015E-05 | global batch size: 16 | lm loss: 6.952589E+00 | loss scale: 16384.0 | grad norm: 68434.169 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2290/ 159576 | consumed samples: 36640 | elapsed time per iteration (ms): 13963.1 | learning rate: 1.015E-05 | global batch size: 16 | lm loss: 6.887176E+00 | loss scale: 16384.0 | grad norm: 89636.101 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2291/ 159576 | consumed samples: 36656 | elapsed time per iteration (ms): 13620.5 | learning rate: 1.016E-05 | global batch size: 16 | lm loss: 6.846462E+00 | loss scale: 16384.0 | grad norm: 73199.296 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2292/ 159576 | consumed samples: 36672 | elapsed time per iteration (ms): 13656.0 | learning rate: 1.016E-05 | global batch size: 16 | lm loss: 7.302676E+00 | loss scale: 16384.0 | grad norm: 174677.987 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2293/ 159576 | consumed samples: 36688 | elapsed time per 
iteration (ms): 13714.2 | learning rate: 1.017E-05 | global batch size: 16 | lm loss: 7.151010E+00 | loss scale: 16384.0 | grad norm: 135612.210 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2294/ 159576 | consumed samples: 36704 | elapsed time per iteration (ms): 13919.9 | learning rate: 1.017E-05 | global batch size: 16 | lm loss: 7.005547E+00 | loss scale: 16384.0 | grad norm: 89084.825 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2295/ 159576 | consumed samples: 36720 | elapsed time per iteration (ms): 13650.1 | learning rate: 1.018E-05 | global batch size: 16 | lm loss: 6.588016E+00 | loss scale: 16384.0 | grad norm: 102875.641 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2296/ 159576 | consumed samples: 36736 | elapsed time per iteration (ms): 13574.9 | learning rate: 1.018E-05 | global batch size: 16 | lm loss: 6.896825E+00 | loss scale: 16384.0 | grad norm: 70940.128 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2297/ 159576 | consumed samples: 36752 | elapsed time per iteration (ms): 13573.3 | learning rate: 1.018E-05 | global batch size: 16 | lm loss: 6.883708E+00 | loss scale: 16384.0 | grad norm: 146744.276 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2298/ 159576 | consumed samples: 36768 | elapsed time per iteration (ms): 13649.6 | learning rate: 1.019E-05 | global batch size: 16 | lm loss: 7.139965E+00 | loss scale: 16384.0 | grad norm: 75816.547 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2299/ 159576 | consumed samples: 36784 | elapsed time per iteration (ms): 13959.1 | learning rate: 1.019E-05 | global batch size: 16 | lm loss: 6.811082E+00 | loss scale: 16384.0 | grad norm: 83246.485 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2300/ 159576 | consumed samples: 36800 | elapsed time per iteration (ms): 13736.9 | learning rate: 1.020E-05 | global batch size: 16 | lm loss: 6.719008E+00 | loss scale: 16384.0 | grad norm: 93595.542 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2301/ 159576 | consumed samples: 36816 | elapsed time per iteration (ms): 13666.3 | learning rate: 1.020E-05 | global batch size: 16 | lm loss: 7.039846E+00 | loss scale: 16384.0 | grad norm: 58298.921 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2302/ 159576 | consumed samples: 36832 | elapsed time per iteration (ms): 13631.9 | learning rate: 1.021E-05 | global batch size: 16 | lm loss: 6.796918E+00 | loss scale: 16384.0 | grad norm: 153620.255 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2303/ 159576 | consumed samples: 36848 | elapsed time per iteration (ms): 13914.1 | learning rate: 1.021E-05 | global batch size: 16 | lm loss: 7.011253E+00 | loss scale: 16384.0 | grad norm: 79116.869 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2304/ 159576 | consumed samples: 36864 | elapsed time per iteration (ms): 13578.2 | learning rate: 1.022E-05 | global batch size: 16 | lm loss: 6.786969E+00 | loss scale: 16384.0 | grad norm: 78214.933 | num zeros: 0.0 | 
number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2305/ 159576 | consumed samples: 36880 | elapsed time per iteration (ms): 13606.1 | learning rate: 1.022E-05 | global batch size: 16 | lm loss: 6.896228E+00 | loss scale: 16384.0 | grad norm: 59758.026 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2306/ 159576 | consumed samples: 36896 | elapsed time per iteration (ms): 13630.5 | learning rate: 1.022E-05 | global batch size: 16 | lm loss: 6.715625E+00 | loss scale: 16384.0 | grad norm: 82018.871 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-[2021-09-24 11:06:17] PULSE: tr8-104B is waiting for the previous job to finish before scheduling a new one using the dependency mechanism (1165978_[1-10%1] on 'gpu_p13' partition)
-[2021-09-24 11:06:17] PULSE: tr8-104B is running for 5:14:06 since 2021-09-24T05:52:11 (1162855_1 on 'gpu_p13' partition (r6i4n[5,7],r6i5n[2,7-8],r6i6n[0,2,6],r7i2n[4-5],r7i6n[2-4],r7i7n[7-8],r8i0n[2-3,5-8],r8i1n[0,2-4],r8i2n8,r8i3n[0-2],r8i5n[3-4],r8i7n[3-8],r9i0n[0-2],r9i1n[0-3],r9i2n[3-5,8],r9i3n[0-1,7-8],r9i4n[0-2],r9i5n[3-8],r9i6n[0,7-8])
- iteration 2307/ 159576 | consumed samples: 36912 | elapsed time per iteration (ms): 13695.2 | learning rate: 1.023E-05 | global batch size: 16 | lm loss: 6.898945E+00 | loss scale: 16384.0 | grad norm: 69074.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2308/ 159576 | consumed samples: 36928 | elapsed time per iteration (ms): 13864.3 | learning rate: 1.023E-05 | global batch size: 16 | lm loss: 6.896221E+00 | loss scale: 16384.0 | grad norm: 86879.176 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2309/ 159576 | consumed samples: 36944 | elapsed time per iteration (ms): 13567.7 | learning rate: 1.024E-05 | global batch size: 16 | lm loss: 6.747959E+00 | loss scale: 16384.0 | grad norm: 77379.473 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2310/ 159576 | consumed samples: 36960 | elapsed time per iteration (ms): 13717.6 | learning rate: 1.024E-05 | global batch size: 16 | lm loss: 6.945070E+00 | loss scale: 16384.0 | grad norm: 55236.968 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2311/ 159576 | consumed samples: 36976 | elapsed time per iteration (ms): 13519.2 | learning rate: 1.025E-05 | global batch size: 16 | lm loss: 7.033360E+00 | loss scale: 16384.0 | grad norm: 184283.626 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2312/ 159576 | consumed samples: 36992 | elapsed time per iteration (ms): 14030.2 | learning rate: 1.025E-05 | global batch size: 16 | lm loss: 7.147439E+00 | loss scale: 16384.0 | grad norm: 152407.329 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2313/ 159576 | consumed samples: 37008 | elapsed time per iteration (ms): 13685.4 | learning rate: 1.026E-05 | global batch size: 16 | lm loss: 6.739760E+00 | loss scale: 16384.0 | grad norm: 71801.831 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2314/ 159576 | consumed samples: 37024 | elapsed time per iteration (ms): 13648.0 | learning rate: 1.026E-05 | global batch size: 16 | lm loss: 6.839672E+00 | loss
scale: 16384.0 | grad norm: 112304.747 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2315/ 159576 | consumed samples: 37040 | elapsed time per iteration (ms): 13683.0 | learning rate: 1.026E-05 | global batch size: 16 | lm loss: 6.987888E+00 | loss scale: 16384.0 | grad norm: 97383.621 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2316/ 159576 | consumed samples: 37056 | elapsed time per iteration (ms): 14019.7 | learning rate: 1.027E-05 | global batch size: 16 | lm loss: 6.766959E+00 | loss scale: 16384.0 | grad norm: 70142.885 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2317/ 159576 | consumed samples: 37072 | elapsed time per iteration (ms): 13698.7 | learning rate: 1.027E-05 | global batch size: 16 | lm loss: 7.002495E+00 | loss scale: 16384.0 | grad norm: 94556.236 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2318/ 159576 | consumed samples: 37088 | elapsed time per iteration (ms): 13548.8 | learning rate: 1.028E-05 | global batch size: 16 | lm loss: 6.785909E+00 | loss scale: 16384.0 | grad norm: 84852.097 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2319/ 159576 | consumed samples: 37104 | elapsed time per iteration (ms): 13558.1 | learning rate: 1.028E-05 | global batch size: 16 | lm loss: 6.969275E+00 | loss scale: 16384.0 | grad norm: 88628.295 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2320/ 159576 | consumed samples: 37120 | elapsed time per iteration (ms): 13584.6 | learning rate: 1.029E-05 | global batch size: 16 | lm loss: 6.991512E+00 | loss scale: 16384.0 | grad norm: 73561.859 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2321/ 159576 | consumed samples: 37136 | elapsed time per iteration (ms): 13808.4 | learning rate: 1.029E-05 | global batch size: 16 | lm loss: 6.689001E+00 | loss scale: 16384.0 | grad norm: 79235.505 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2322/ 159576 | consumed samples: 37152 | elapsed time per iteration (ms): 13660.8 | learning rate: 1.030E-05 | global batch size: 16 | lm loss: 6.829502E+00 | loss scale: 16384.0 | grad norm: 69229.325 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2323/ 159576 | consumed samples: 37168 | elapsed time per iteration (ms): 13667.4 | learning rate: 1.030E-05 | global batch size: 16 | lm loss: 6.532575E+00 | loss scale: 16384.0 | grad norm: 55927.225 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2324/ 159576 | consumed samples: 37184 | elapsed time per iteration (ms): 13703.5 | learning rate: 1.030E-05 | global batch size: 16 | lm loss: 6.922344E+00 | loss scale: 16384.0 | grad norm: 55395.267 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2325/ 159576 | consumed samples: 37200 | elapsed time per iteration (ms): 14028.0 | learning rate: 1.031E-05 | global batch size: 16 | lm loss: 6.827266E+00 | loss scale: 16384.0 | grad norm: 53256.272 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2326/ 159576 | 
consumed samples: 37216 | elapsed time per iteration (ms): 13463.4 | learning rate: 1.031E-05 | global batch size: 16 | lm loss: 6.792019E+00 | loss scale: 16384.0 | grad norm: 61740.952 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2327/ 159576 | consumed samples: 37232 | elapsed time per iteration (ms): 13567.6 | learning rate: 1.032E-05 | global batch size: 16 | lm loss: 6.871485E+00 | loss scale: 16384.0 | grad norm: 65916.886 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2328/ 159576 | consumed samples: 37248 | elapsed time per iteration (ms): 13610.6 | learning rate: 1.032E-05 | global batch size: 16 | lm loss: 6.773655E+00 | loss scale: 16384.0 | grad norm: 55451.884 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2329/ 159576 | consumed samples: 37264 | elapsed time per iteration (ms): 13843.3 | learning rate: 1.033E-05 | global batch size: 16 | lm loss: 6.881806E+00 | loss scale: 16384.0 | grad norm: 68242.844 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2330/ 159576 | consumed samples: 37280 | elapsed time per iteration (ms): 13903.0 | learning rate: 1.033E-05 | global batch size: 16 | lm loss: 6.769863E+00 | loss scale: 16384.0 | grad norm: 54395.878 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2331/ 159576 | consumed samples: 37296 | elapsed time per iteration (ms): 13689.8 | learning rate: 1.034E-05 | global batch size: 16 | lm loss: 6.915558E+00 | loss scale: 16384.0 | grad norm: 69787.282 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2332/ 159576 | consumed samples: 37312 | elapsed time per iteration (ms): 13584.4 | learning rate: 1.034E-05 | global batch size: 16 | lm loss: 6.872691E+00 | loss scale: 16384.0 | grad norm: 53158.222 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2333/ 159576 | consumed samples: 37328 | elapsed time per iteration (ms): 13510.8 | learning rate: 1.034E-05 | global batch size: 16 | lm loss: 6.772065E+00 | loss scale: 16384.0 | grad norm: 62866.204 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2334/ 159576 | consumed samples: 37344 | elapsed time per iteration (ms): 13981.1 | learning rate: 1.035E-05 | global batch size: 16 | lm loss: 6.889673E+00 | loss scale: 16384.0 | grad norm: 79595.177 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2335/ 159576 | consumed samples: 37360 | elapsed time per iteration (ms): 13567.6 | learning rate: 1.035E-05 | global batch size: 16 | lm loss: 6.996318E+00 | loss scale: 16384.0 | grad norm: 47255.254 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2336/ 159576 | consumed samples: 37376 | elapsed time per iteration (ms): 13643.5 | learning rate: 1.036E-05 | global batch size: 16 | lm loss: 6.824782E+00 | loss scale: 16384.0 | grad norm: 152401.829 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2337/ 159576 | consumed samples: 37392 | elapsed time per iteration (ms): 13630.4 | learning rate: 1.036E-05 | global batch size: 16 | lm loss: 6.711504E+00 | loss scale: 16384.0 | 
grad norm: 73188.569 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2338/ 159576 | consumed samples: 37408 | elapsed time per iteration (ms): 14043.0 | learning rate: 1.037E-05 | global batch size: 16 | lm loss: 6.830018E+00 | loss scale: 16384.0 | grad norm: 92791.023 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2339/ 159576 | consumed samples: 37424 | elapsed time per iteration (ms): 13758.4 | learning rate: 1.037E-05 | global batch size: 16 | lm loss: 7.017688E+00 | loss scale: 16384.0 | grad norm: 87062.269 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2340/ 159576 | consumed samples: 37440 | elapsed time per iteration (ms): 13518.0 | learning rate: 1.038E-05 | global batch size: 16 | lm loss: 6.749167E+00 | loss scale: 16384.0 | grad norm: 72774.580 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2341/ 159576 | consumed samples: 37456 | elapsed time per iteration (ms): 13582.6 | learning rate: 1.038E-05 | global batch size: 16 | lm loss: 7.188419E+00 | loss scale: 16384.0 | grad norm: 400324.609 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2342/ 159576 | consumed samples: 37472 | elapsed time per iteration (ms): 13646.9 | learning rate: 1.038E-05 | global batch size: 16 | lm loss: 7.124457E+00 | loss scale: 16384.0 | grad norm: 441674.699 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2343/ 159576 | consumed samples: 37488 | elapsed time per iteration (ms): 13721.9 | learning rate: 1.039E-05 | global batch size: 16 | lm loss: 6.941244E+00 | loss scale: 16384.0 | grad norm: 218702.636 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2344/ 159576 | consumed samples: 37504 | elapsed time per iteration (ms): 13653.7 | learning rate: 1.039E-05 | global batch size: 16 | lm loss: 6.768173E+00 | loss scale: 16384.0 | grad norm: 93071.046 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2345/ 159576 | consumed samples: 37520 | elapsed time per iteration (ms): 13684.4 | learning rate: 1.040E-05 | global batch size: 16 | lm loss: 6.862311E+00 | loss scale: 16384.0 | grad norm: 105985.790 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2346/ 159576 | consumed samples: 37536 | elapsed time per iteration (ms): 13732.9 | learning rate: 1.040E-05 | global batch size: 16 | lm loss: 7.097474E+00 | loss scale: 16384.0 | grad norm: 93646.720 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2347/ 159576 | consumed samples: 37552 | elapsed time per iteration (ms): 14087.6 | learning rate: 1.041E-05 | global batch size: 16 | lm loss: 6.949347E+00 | loss scale: 16384.0 | grad norm: 169536.748 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2348/ 159576 | consumed samples: 37568 | elapsed time per iteration (ms): 13603.2 | learning rate: 1.041E-05 | global batch size: 16 | lm loss: 6.839984E+00 | loss scale: 16384.0 | grad norm: 221068.794 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2349/ 159576 | consumed samples: 
37584 | elapsed time per iteration (ms): 13602.7 | learning rate: 1.042E-05 | global batch size: 16 | lm loss: 6.722544E+00 | loss scale: 16384.0 | grad norm: 90138.978 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2350/ 159576 | consumed samples: 37600 | elapsed time per iteration (ms): 13600.0 | learning rate: 1.042E-05 | global batch size: 16 | lm loss: 6.765959E+00 | loss scale: 16384.0 | grad norm: 87849.268 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2351/ 159576 | consumed samples: 37616 | elapsed time per iteration (ms): 14049.9 | learning rate: 1.042E-05 | global batch size: 16 | lm loss: 7.058582E+00 | loss scale: 16384.0 | grad norm: 97203.038 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2352/ 159576 | consumed samples: 37632 | elapsed time per iteration (ms): 13664.4 | learning rate: 1.043E-05 | global batch size: 16 | lm loss: 6.709276E+00 | loss scale: 16384.0 | grad norm: 64321.034 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2353/ 159576 | consumed samples: 37648 | elapsed time per iteration (ms): 13697.2 | learning rate: 1.043E-05 | global batch size: 16 | lm loss: 6.963477E+00 | loss scale: 16384.0 | grad norm: 219491.874 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2354/ 159576 | consumed samples: 37664 | elapsed time per iteration (ms): 13647.8 | learning rate: 1.044E-05 | global batch size: 16 | lm loss: 6.986011E+00 | loss scale: 16384.0 | grad norm: 159710.177 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2355/ 159576 | consumed samples: 37680 | elapsed time per iteration (ms): 13594.7 | learning rate: 1.044E-05 | global batch size: 16 | lm loss: 6.833197E+00 | loss scale: 16384.0 | grad norm: 97227.942 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2356/ 159576 | consumed samples: 37696 | elapsed time per iteration (ms): 13840.6 | learning rate: 1.045E-05 | global batch size: 16 | lm loss: 7.008437E+00 | loss scale: 16384.0 | grad norm: 89122.852 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2357/ 159576 | consumed samples: 37712 | elapsed time per iteration (ms): 13588.8 | learning rate: 1.045E-05 | global batch size: 16 | lm loss: 6.835823E+00 | loss scale: 16384.0 | grad norm: 77947.804 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2358/ 159576 | consumed samples: 37728 | elapsed time per iteration (ms): 13642.6 | learning rate: 1.046E-05 | global batch size: 16 | lm loss: 6.735652E+00 | loss scale: 16384.0 | grad norm: 162106.613 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2359/ 159576 | consumed samples: 37744 | elapsed time per iteration (ms): 13658.5 | learning rate: 1.046E-05 | global batch size: 16 | lm loss: 6.785017E+00 | loss scale: 16384.0 | grad norm: 128794.072 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2360/ 159576 | consumed samples: 37760 | elapsed time per iteration (ms): 14062.2 | learning rate: 1.046E-05 | global batch size: 16 | lm loss: 6.878942E+00 | loss scale: 16384.0 | grad norm: 
101269.625 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2361/ 159576 | consumed samples: 37776 | elapsed time per iteration (ms): 13561.0 | learning rate: 1.047E-05 | global batch size: 16 | lm loss: 6.893463E+00 | loss scale: 16384.0 | grad norm: 78515.515 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2362/ 159576 | consumed samples: 37792 | elapsed time per iteration (ms): 13714.6 | learning rate: 1.047E-05 | global batch size: 16 | lm loss: 6.821845E+00 | loss scale: 16384.0 | grad norm: 78649.404 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2363/ 159576 | consumed samples: 37808 | elapsed time per iteration (ms): 13594.5 | learning rate: 1.048E-05 | global batch size: 16 | lm loss: 6.845947E+00 | loss scale: 16384.0 | grad norm: 158409.972 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2364/ 159576 | consumed samples: 37824 | elapsed time per iteration (ms): 13648.4 | learning rate: 1.048E-05 | global batch size: 16 | lm loss: 6.840971E+00 | loss scale: 16384.0 | grad norm: 88723.502 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2365/ 159576 | consumed samples: 37840 | elapsed time per iteration (ms): 13958.9 | learning rate: 1.049E-05 | global batch size: 16 | lm loss: 6.785653E+00 | loss scale: 16384.0 | grad norm: 106713.788 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2366/ 159576 | consumed samples: 37856 | elapsed time per iteration (ms): 13666.9 | learning rate: 1.049E-05 | global batch size: 16 | lm loss: 6.917600E+00 | loss scale: 16384.0 | grad norm: 90335.595 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2367/ 159576 | consumed samples: 37872 | elapsed time per iteration (ms): 13690.6 | learning rate: 1.050E-05 | global batch size: 16 | lm loss: 6.840955E+00 | loss scale: 16384.0 | grad norm: 63357.757 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2368/ 159576 | consumed samples: 37888 | elapsed time per iteration (ms): 13664.8 | learning rate: 1.050E-05 | global batch size: 16 | lm loss: 6.916069E+00 | loss scale: 16384.0 | grad norm: 107961.857 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2369/ 159576 | consumed samples: 37904 | elapsed time per iteration (ms): 14065.2 | learning rate: 1.050E-05 | global batch size: 16 | lm loss: 6.853414E+00 | loss scale: 16384.0 | grad norm: 84442.897 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2370/ 159576 | consumed samples: 37920 | elapsed time per iteration (ms): 13656.3 | learning rate: 1.051E-05 | global batch size: 16 | lm loss: 6.827930E+00 | loss scale: 16384.0 | grad norm: 62880.352 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2371/ 159576 | consumed samples: 37936 | elapsed time per iteration (ms): 13590.5 | learning rate: 1.051E-05 | global batch size: 16 | lm loss: 6.877656E+00 | loss scale: 16384.0 | grad norm: 75866.540 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2372/ 159576 | consumed samples: 37952 | 
elapsed time per iteration (ms): 13605.0 | learning rate: 1.052E-05 | global batch size: 16 | lm loss: 6.995963E+00 | loss scale: 16384.0 | grad norm: 71192.528 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2373/ 159576 | consumed samples: 37968 | elapsed time per iteration (ms): 13951.5 | learning rate: 1.052E-05 | global batch size: 16 | lm loss: 6.794531E+00 | loss scale: 16384.0 | grad norm: 64517.387 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2374/ 159576 | consumed samples: 37984 | elapsed time per iteration (ms): 13624.2 | learning rate: 1.053E-05 | global batch size: 16 | lm loss: 6.780855E+00 | loss scale: 16384.0 | grad norm: 83255.646 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2375/ 159576 | consumed samples: 38000 | elapsed time per iteration (ms): 13615.3 | learning rate: 1.053E-05 | global batch size: 16 | lm loss: 6.964709E+00 | loss scale: 16384.0 | grad norm: 79867.121 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2376/ 159576 | consumed samples: 38016 | elapsed time per iteration (ms): 13718.1 | learning rate: 1.054E-05 | global batch size: 16 | lm loss: 6.657259E+00 | loss scale: 16384.0 | grad norm: 60555.655 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2377/ 159576 | consumed samples: 38032 | elapsed time per iteration (ms): 13629.0 | learning rate: 1.054E-05 | global batch size: 16 | lm loss: 6.923594E+00 | loss scale: 16384.0 | grad norm: 52753.203 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2378/ 159576 | consumed samples: 38048 | elapsed time per iteration (ms): 13734.6 | learning rate: 1.054E-05 | global batch size: 16 | lm loss: 6.887539E+00 | loss scale: 16384.0 | grad norm: 103430.254 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2379/ 159576 | consumed samples: 38064 | elapsed time per iteration (ms): 13608.8 | learning rate: 1.055E-05 | global batch size: 16 | lm loss: 6.627044E+00 | loss scale: 16384.0 | grad norm: 73977.582 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2380/ 159576 | consumed samples: 38080 | elapsed time per iteration (ms): 13595.9 | learning rate: 1.055E-05 | global batch size: 16 | lm loss: 6.894679E+00 | loss scale: 16384.0 | grad norm: 66400.111 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2381/ 159576 | consumed samples: 38096 | elapsed time per iteration (ms): 13599.7 | learning rate: 1.056E-05 | global batch size: 16 | lm loss: 6.938529E+00 | loss scale: 16384.0 | grad norm: 70512.175 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2382/ 159576 | consumed samples: 38112 | elapsed time per iteration (ms): 14135.5 | learning rate: 1.056E-05 | global batch size: 16 | lm loss: 7.303653E+00 | loss scale: 16384.0 | grad norm: 79783.691 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2383/ 159576 | consumed samples: 38128 | elapsed time per iteration (ms): 13647.3 | learning rate: 1.057E-05 | global batch size: 16 | lm loss: 6.764983E+00 | loss scale: 16384.0 | grad norm: 74049.858 | 
num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2384/ 159576 | consumed samples: 38144 | elapsed time per iteration (ms): 13719.9 | learning rate: 1.057E-05 | global batch size: 16 | lm loss: 7.032783E+00 | loss scale: 16384.0 | grad norm: 66855.312 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2385/ 159576 | consumed samples: 38160 | elapsed time per iteration (ms): 13573.5 | learning rate: 1.058E-05 | global batch size: 16 | lm loss: 6.839710E+00 | loss scale: 16384.0 | grad norm: 58744.040 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2386/ 159576 | consumed samples: 38176 | elapsed time per iteration (ms): 14051.4 | learning rate: 1.058E-05 | global batch size: 16 | lm loss: 6.409803E+00 | loss scale: 16384.0 | grad norm: 54804.059 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2387/ 159576 | consumed samples: 38192 | elapsed time per iteration (ms): 13628.8 | learning rate: 1.058E-05 | global batch size: 16 | lm loss: 6.752995E+00 | loss scale: 16384.0 | grad norm: 57078.432 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2388/ 159576 | consumed samples: 38208 | elapsed time per iteration (ms): 13611.0 | learning rate: 1.059E-05 | global batch size: 16 | lm loss: 6.738320E+00 | loss scale: 16384.0 | grad norm: 45381.080 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2389/ 159576 | consumed samples: 38224 | elapsed time per iteration (ms): 13583.7 | learning rate: 1.059E-05 | global batch size: 16 | lm loss: 6.858883E+00 | loss scale: 16384.0 | grad norm: 86212.464 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2390/ 159576 | consumed samples: 38240 | elapsed time per iteration (ms): 13679.8 | learning rate: 1.060E-05 | global batch size: 16 | lm loss: 7.024375E+00 | loss scale: 16384.0 | grad norm: 66322.711 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2391/ 159576 | consumed samples: 38256 | elapsed time per iteration (ms): 13997.0 | learning rate: 1.060E-05 | global batch size: 16 | lm loss: 6.983364E+00 | loss scale: 16384.0 | grad norm: 84730.119 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2392/ 159576 | consumed samples: 38272 | elapsed time per iteration (ms): 13673.8 | learning rate: 1.061E-05 | global batch size: 16 | lm loss: 6.900928E+00 | loss scale: 16384.0 | grad norm: 52849.295 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2393/ 159576 | consumed samples: 38288 | elapsed time per iteration (ms): 13615.2 | learning rate: 1.061E-05 | global batch size: 16 | lm loss: 6.866693E+00 | loss scale: 16384.0 | grad norm: 87208.382 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2394/ 159576 | consumed samples: 38304 | elapsed time per iteration (ms): 13615.9 | learning rate: 1.062E-05 | global batch size: 16 | lm loss: 6.702727E+00 | loss scale: 16384.0 | grad norm: 69928.497 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2395/ 159576 | consumed samples: 38320 | elapsed time per 
iteration (ms): 14056.6 | learning rate: 1.062E-05 | global batch size: 16 | lm loss: 6.909261E+00 | loss scale: 16384.0 | grad norm: 122690.959 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2396/ 159576 | consumed samples: 38336 | elapsed time per iteration (ms): 13483.1 | learning rate: 1.062E-05 | global batch size: 16 | lm loss: 6.938586E+00 | loss scale: 16384.0 | grad norm: 80283.093 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2397/ 159576 | consumed samples: 38352 | elapsed time per iteration (ms): 13678.0 | learning rate: 1.063E-05 | global batch size: 16 | lm loss: 6.916673E+00 | loss scale: 16384.0 | grad norm: 78417.587 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2398/ 159576 | consumed samples: 38368 | elapsed time per iteration (ms): 13713.3 | learning rate: 1.063E-05 | global batch size: 16 | lm loss: 6.894761E+00 | loss scale: 16384.0 | grad norm: 79613.154 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2399/ 159576 | consumed samples: 38384 | elapsed time per iteration (ms): 13844.0 | learning rate: 1.064E-05 | global batch size: 16 | lm loss: 6.895288E+00 | loss scale: 16384.0 | grad norm: 117360.649 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2400/ 159576 | consumed samples: 38400 | elapsed time per iteration (ms): 13869.8 | learning rate: 1.064E-05 | global batch size: 16 | lm loss: 7.002610E+00 | loss scale: 16384.0 | grad norm: 98958.976 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2401/ 159576 | consumed samples: 38416 | elapsed time per iteration (ms): 13601.8 | learning rate: 1.065E-05 | global batch size: 16 | lm loss: 6.744779E+00 | loss scale: 16384.0 | grad norm: 75497.856 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2402/ 159576 | consumed samples: 38432 | elapsed time per iteration (ms): 13599.2 | learning rate: 1.065E-05 | global batch size: 16 | lm loss: 7.107717E+00 | loss scale: 16384.0 | grad norm: 78343.496 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2403/ 159576 | consumed samples: 38448 | elapsed time per iteration (ms): 13623.1 | learning rate: 1.066E-05 | global batch size: 16 | lm loss: 6.897991E+00 | loss scale: 16384.0 | grad norm: 89054.642 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2404/ 159576 | consumed samples: 38464 | elapsed time per iteration (ms): 14088.2 | learning rate: 1.066E-05 | global batch size: 16 | lm loss: 6.915084E+00 | loss scale: 16384.0 | grad norm: 88153.392 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2405/ 159576 | consumed samples: 38480 | elapsed time per iteration (ms): 13711.7 | learning rate: 1.066E-05 | global batch size: 16 | lm loss: 6.791551E+00 | loss scale: 16384.0 | grad norm: 81047.565 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2406/ 159576 | consumed samples: 38496 | elapsed time per iteration (ms): 13659.9 | learning rate: 1.067E-05 | global batch size: 16 | lm loss: 6.768214E+00 | loss scale: 16384.0 | grad norm: 63942.069 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2407/ 159576 | consumed samples: 38512 | elapsed time per iteration (ms): 13659.5 | learning rate: 1.067E-05 | global batch size: 16 | lm loss: 6.785830E+00 | loss scale: 16384.0 | grad norm: 50544.281 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2408/ 159576 | consumed samples: 38528 | elapsed time per iteration (ms): 14010.2 | learning rate: 1.068E-05 | global batch size: 16 | lm loss: 6.781000E+00 | loss scale: 16384.0 | grad norm: 114170.359 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2409/ 159576 | consumed samples: 38544 | elapsed time per iteration (ms): 13587.7 | learning rate: 1.068E-05 | global batch size: 16 | lm loss: 6.876911E+00 | loss scale: 16384.0 | grad norm: 60235.711 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2410/ 159576 | consumed samples: 38560 | elapsed time per iteration (ms): 13605.6 | learning rate: 1.069E-05 | global batch size: 16 | lm loss: 6.837091E+00 | loss scale: 16384.0 | grad norm: 72387.988 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2411/ 159576 | consumed samples: 38576 | elapsed time per iteration (ms): 13675.7 | learning rate: 1.069E-05 | global batch size: 16 | lm loss: 6.912636E+00 | loss scale: 16384.0 | grad norm: 76432.994 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2412/ 159576 | consumed samples: 38592 | elapsed time per iteration (ms): 13569.6 | learning rate: 1.070E-05 | global batch size: 16 | lm loss: 6.712539E+00 | loss scale: 16384.0 | grad norm: 113832.300 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2413/ 159576 | consumed samples: 38608 | elapsed time per iteration (ms): 13932.9 | learning rate: 1.070E-05 | global batch size: 16 | lm loss: 6.804219E+00 | loss scale: 16384.0 | grad norm: 73073.257 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2414/ 159576 | consumed samples: 38624 | elapsed time per iteration (ms): 13742.1 | learning rate: 1.070E-05 | global batch size: 16 | lm loss: 6.947999E+00 | loss scale: 16384.0 | grad norm: 90599.997 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2415/ 159576 | consumed samples: 38640 | elapsed time per iteration (ms): 13556.3 | learning rate: 1.071E-05 | global batch size: 16 | lm loss: 7.002557E+00 | loss scale: 16384.0 | grad norm: 71840.830 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2416/ 159576 | consumed samples: 38656 | elapsed time per iteration (ms): 13593.5 | learning rate: 1.071E-05 | global batch size: 16 | lm loss: 6.920745E+00 | loss scale: 16384.0 | grad norm: 60284.538 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2417/ 159576 | consumed samples: 38672 | elapsed time per iteration (ms): 14084.6 | learning rate: 1.072E-05 | global batch size: 16 | lm loss: 7.137000E+00 | loss scale: 16384.0 | grad norm: 185539.999 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2418/ 159576 | consumed samples: 38688 | elapsed time per iteration (ms): 13641.5 | learning rate: 1.072E-05 | global batch size: 16 | lm loss: 6.757603E+00 | loss scale: 16384.0 | grad norm: 127319.529 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2419/ 159576 | consumed samples: 38704 | elapsed time per iteration (ms): 13580.1 | learning rate: 1.073E-05 | global batch size: 16 | lm loss: 6.869411E+00 | loss scale: 16384.0 | grad norm: 97709.249 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2420/ 159576 | consumed samples: 38720 | elapsed time per iteration (ms): 13629.2 | learning rate: 1.073E-05 | global batch size: 16 | lm loss: 6.709553E+00 | loss scale: 16384.0 | grad norm: 92144.986 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2421/ 159576 | consumed samples: 38736 | elapsed time per iteration (ms): 14151.6 | learning rate: 1.074E-05 | global batch size: 16 | lm loss: 6.884684E+00 | loss scale: 16384.0 | grad norm: 68698.421 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2422/ 159576 | consumed samples: 38752 | elapsed time per iteration (ms): 13613.5 | learning rate: 1.074E-05 | global batch size: 16 | lm loss: 6.869916E+00 | loss scale: 16384.0 | grad norm: 183504.116 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2423/ 159576 | consumed samples: 38768 | elapsed time per iteration (ms): 13633.7 | learning rate: 1.074E-05 | global batch size: 16 | lm loss: 6.890718E+00 | loss scale: 16384.0 | grad norm: 156548.776 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2424/ 159576 | consumed samples: 38784 | elapsed time per iteration (ms): 13607.9 | learning rate: 1.075E-05 | global batch size: 16 | lm loss: 6.935307E+00 | loss scale: 16384.0 | grad norm: 64330.150 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2425/ 159576 | consumed samples: 38800 | elapsed time per iteration (ms): 13605.4 | learning rate: 1.075E-05 | global batch size: 16 | lm loss: 6.766086E+00 | loss scale: 16384.0 | grad norm: 69465.082 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2426/ 159576 | consumed samples: 38816 | elapsed time per iteration (ms): 13928.6 | learning rate: 1.076E-05 | global batch size: 16 | lm loss: 7.066947E+00 | loss scale: 16384.0 | grad norm: 107634.865 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2427/ 159576 | consumed samples: 38832 | elapsed time per iteration (ms): 13650.1 | learning rate: 1.076E-05 | global batch size: 16 | lm loss: 7.050639E+00 | loss scale: 16384.0 | grad norm: 95342.870 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2428/ 159576 | consumed samples: 38848 | elapsed time per iteration (ms): 13681.2 | learning rate: 1.077E-05 | global batch size: 16 | lm loss: 6.855616E+00 | loss scale: 16384.0 | grad norm: 59595.304 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2429/ 159576 | consumed samples: 38864 | elapsed time per iteration (ms): 13695.9 | learning rate: 1.077E-05 | global batch size: 16 | lm loss: 7.041804E+00 | loss scale: 16384.0 | grad norm: 65131.323 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2430/ 159576 | consumed samples: 38880 | elapsed time per iteration (ms): 13962.7 | learning rate: 1.078E-05 | global batch size: 16 | lm loss: 6.803939E+00 | loss scale: 16384.0 | grad norm: 63269.225 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2431/ 159576 | consumed samples: 38896 | elapsed time per iteration (ms): 13583.2 | learning rate: 1.078E-05 | global batch size: 16 | lm loss: 6.876345E+00 | loss scale: 16384.0 | grad norm: 74949.342 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2432/ 159576 | consumed samples: 38912 | elapsed time per iteration (ms): 13606.6 | learning rate: 1.078E-05 | global batch size: 16 | lm loss: 6.916327E+00 | loss scale: 16384.0 | grad norm: 74586.629 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2433/ 159576 | consumed samples: 38928 | elapsed time per iteration (ms): 13607.5 | learning rate: 1.079E-05 | global batch size: 16 | lm loss: 6.779680E+00 | loss scale: 16384.0 | grad norm: 82519.547 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2434/ 159576 | consumed samples: 38944 | elapsed time per iteration (ms): 13894.0 | learning rate: 1.079E-05 | global batch size: 16 | lm loss: 6.903611E+00 | loss scale: 16384.0 | grad norm: 69004.549 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2435/ 159576 | consumed samples: 38960 | elapsed time per iteration (ms): 13779.1 | learning rate: 1.080E-05 | global batch size: 16 | lm loss: 6.630243E+00 | loss scale: 16384.0 | grad norm: 107197.604 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2436/ 159576 | consumed samples: 38976 | elapsed time per iteration (ms): 13659.0 | learning rate: 1.080E-05 | global batch size: 16 | lm loss: 6.876919E+00 | loss scale: 16384.0 | grad norm: 77407.687 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2437/ 159576 | consumed samples: 38992 | elapsed time per iteration (ms): 13553.5 | learning rate: 1.081E-05 | global batch size: 16 | lm loss: 6.728307E+00 | loss scale: 16384.0 | grad norm: 79645.236 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2438/ 159576 | consumed samples: 39008 | elapsed time per iteration (ms): 13664.0 | learning rate: 1.081E-05 | global batch size: 16 | lm loss: 6.923852E+00 | loss scale: 16384.0 | grad norm: 70221.677 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2439/ 159576 | consumed samples: 39024 | elapsed time per iteration (ms): 13814.4 | learning rate: 1.082E-05 | global batch size: 16 | lm loss: 6.729681E+00 | loss scale: 16384.0 | grad norm: 71734.084 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2440/ 159576 | consumed samples: 39040 | elapsed time per iteration (ms): 13667.6 | learning rate: 1.082E-05 | global batch size: 16 | lm loss: 6.668837E+00 | loss scale: 16384.0 | grad norm: 69995.202 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2441/ 159576 | consumed samples: 39056 | elapsed time per iteration (ms): 13617.8 | learning rate: 1.082E-05 | global batch size: 16 | lm loss: 6.781438E+00 | loss scale: 16384.0 | grad norm: 49304.992 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2442/ 159576 | consumed samples: 39072 | elapsed time per iteration (ms): 13652.0 | learning rate: 1.083E-05 | global batch size: 16 | lm loss: 6.810652E+00 | loss scale: 16384.0 | grad norm: 86564.989 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2443/ 159576 | consumed samples: 39088 | elapsed time per iteration (ms): 14063.1 | learning rate: 1.083E-05 | global batch size: 16 | lm loss: 6.879047E+00 | loss scale: 16384.0 | grad norm: 56659.131 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2444/ 159576 | consumed samples: 39104 | elapsed time per iteration (ms): 13586.9 | learning rate: 1.084E-05 | global batch size: 16 | lm loss: 6.494076E+00 | loss scale: 16384.0 | grad norm: 72585.008 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2445/ 159576 | consumed samples: 39120 | elapsed time per iteration (ms): 13676.6 | learning rate: 1.084E-05 | global batch size: 16 | lm loss: 6.713490E+00 | loss scale: 16384.0 | grad norm: 68348.114 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2446/ 159576 | consumed samples: 39136 | elapsed time per iteration (ms): 13706.8 | learning rate: 1.085E-05 | global batch size: 16 | lm loss: 6.970970E+00 | loss scale: 16384.0 | grad norm: 145461.809 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2447/ 159576 | consumed samples: 39152 | elapsed time per iteration (ms): 13581.7 | learning rate: 1.085E-05 | global batch size: 16 | lm loss: 6.777845E+00 | loss scale: 16384.0 | grad norm: 67935.233 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2448/ 159576 | consumed samples: 39168 | elapsed time per iteration (ms): 13810.2 | learning rate: 1.086E-05 | global batch size: 16 | lm loss: 6.772415E+00 | loss scale: 16384.0 | grad norm: 86835.992 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2449/ 159576 | consumed samples: 39184 | elapsed time per iteration (ms): 13641.6 | learning rate: 1.086E-05 | global batch size: 16 | lm loss: 6.901608E+00 | loss scale: 16384.0 | grad norm: 86381.928 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2450/ 159576 | consumed samples: 39200 | elapsed time per iteration (ms): 13577.4 | learning rate: 1.086E-05 | global batch size: 16 | lm loss: 6.923601E+00 | loss scale: 16384.0 | grad norm: 67065.336 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2451/ 159576 | consumed samples: 39216 | elapsed time per iteration (ms): 13656.8 | learning rate: 1.087E-05 | global batch size: 16 | lm loss: 6.635858E+00 | loss scale: 16384.0 | grad norm: 118766.424 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2452/ 159576 | consumed samples: 39232 | elapsed time per iteration (ms): 14182.2 | learning rate: 1.087E-05 | global batch size: 16 | lm loss: 6.798747E+00 | loss scale: 16384.0 | grad norm: 86778.590 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2453/ 159576 | consumed samples: 39248 | elapsed time per iteration (ms): 13794.7 | learning rate: 1.088E-05 | global batch size: 16 | lm loss: 6.934669E+00 | loss scale: 16384.0 | grad norm: 72867.559 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2454/ 159576 | consumed samples: 39264 | elapsed time per iteration (ms): 13649.1 | learning rate: 1.088E-05 | global batch size: 16 | lm loss: 6.689157E+00 | loss scale: 16384.0 | grad norm: 53809.726 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2455/ 159576 | consumed samples: 39280 | elapsed time per iteration (ms): 13619.0 | learning rate: 1.089E-05 | global batch size: 16 | lm loss: 6.797565E+00 | loss scale: 16384.0 | grad norm: 130277.119 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2456/ 159576 | consumed samples: 39296 | elapsed time per iteration (ms): 14036.7 | learning rate: 1.089E-05 | global batch size: 16 | lm loss: 6.919378E+00 | loss scale: 16384.0 | grad norm: 68731.938 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2457/ 159576 | consumed samples: 39312 | elapsed time per iteration (ms): 13656.3 | learning rate: 1.089E-05 | global batch size: 16 | lm loss: 6.658165E+00 | loss scale: 16384.0 | grad norm: 90782.352 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2458/ 159576 | consumed samples: 39328 | elapsed time per iteration (ms): 13635.5 | learning rate: 1.090E-05 | global batch size: 16 | lm loss: 6.614546E+00 | loss scale: 16384.0 | grad norm: 80319.945 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2459/ 159576 | consumed samples: 39344 | elapsed time per iteration (ms): 13648.3 | learning rate: 1.090E-05 | global batch size: 16 | lm loss: 6.813863E+00 | loss scale: 16384.0 | grad norm: 96291.265 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2460/ 159576 | consumed samples: 39360 | elapsed time per iteration (ms): 13655.8 | learning rate: 1.091E-05 | global batch size: 16 | lm loss: 7.162710E+00 | loss scale: 16384.0 | grad norm: 58863.008 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2461/ 159576 | consumed samples: 39376 | elapsed time per iteration (ms): 13960.2 | learning rate: 1.091E-05 | global batch size: 16 | lm loss: 6.991768E+00 | loss scale: 16384.0 | grad norm: 72538.165 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2462/ 159576 | consumed samples: 39392 | elapsed time per iteration (ms): 13649.7 | learning rate: 1.092E-05 | global batch size: 16 | lm loss: 6.712080E+00 | loss scale: 16384.0 | grad norm: 76061.911 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2463/ 159576 | consumed samples: 39408 | elapsed time per iteration (ms): 13665.9 | learning rate: 1.092E-05 | global batch size: 16 | lm loss: 6.697587E+00 | loss scale: 16384.0 | grad norm: 78444.184 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2464/ 159576 | consumed samples: 39424 | elapsed time per iteration (ms): 13548.3 | learning rate: 1.093E-05 | global batch size: 16 | lm loss: 6.767040E+00 | loss scale: 16384.0 | grad norm: 71114.390 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2465/ 159576 | consumed samples: 39440 | elapsed time per iteration (ms): 13972.6 | learning rate: 1.093E-05 | global batch size: 16 | lm loss: 6.750882E+00 | loss scale: 16384.0 | grad norm: 60498.457 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2466/ 159576 | consumed samples: 39456 | elapsed time per iteration (ms): 13657.9 | learning rate: 1.093E-05 | global batch size: 16 | lm loss: 6.631062E+00 | loss scale: 16384.0 | grad norm: 75019.075 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2467/ 159576 | consumed samples: 39472 | elapsed time per iteration (ms): 13692.3 | learning rate: 1.094E-05 | global batch size: 16 | lm loss: 6.725332E+00 | loss scale: 16384.0 | grad norm: 53922.114 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2468/ 159576 | consumed samples: 39488 | elapsed time per iteration (ms): 13656.1 | learning rate: 1.094E-05 | global batch size: 16 | lm loss: 6.736504E+00 | loss scale: 16384.0 | grad norm: 54250.719 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2469/ 159576 | consumed samples: 39504 | elapsed time per iteration (ms): 14009.1 | learning rate: 1.095E-05 | global batch size: 16 | lm loss: 6.881338E+00 | loss scale: 16384.0 | grad norm: 64641.652 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2470/ 159576 | consumed samples: 39520 | elapsed time per iteration (ms): 13853.1 | learning rate: 1.095E-05 | global batch size: 16 | lm loss: 6.742140E+00 | loss scale: 16384.0 | grad norm: 52195.808 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2471/ 159576 | consumed samples: 39536 | elapsed time per iteration (ms): 13541.2 | learning rate: 1.096E-05 | global batch size: 16 | lm loss: 6.830609E+00 | loss scale: 16384.0 | grad norm: 98883.799 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2472/ 159576 | consumed samples: 39552 | elapsed time per iteration (ms): 13618.7 | learning rate: 1.096E-05 | global batch size: 16 | lm loss: 6.770423E+00 | loss scale: 16384.0 | grad norm: 66896.725 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2473/ 159576 | consumed samples: 39568 | elapsed time per iteration (ms): 13623.5 | learning rate: 1.097E-05 | global batch size: 16 | lm loss: 6.926878E+00 | loss scale: 16384.0 | grad norm: 74406.160 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2474/ 159576 | consumed samples: 39584 | elapsed time per iteration (ms): 14089.9 | learning rate: 1.097E-05 | global batch size: 16 | lm loss: 6.834147E+00 | loss scale: 16384.0 | grad norm: 61442.184 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2475/ 159576 | consumed samples: 39600 | elapsed time per iteration (ms): 13713.9 | learning rate: 1.097E-05 | global batch size: 16 | lm loss: 6.711390E+00 | loss scale: 16384.0 | grad norm: 72993.188 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2476/ 159576 | consumed samples: 39616 | elapsed time per iteration (ms): 13666.0 | learning rate: 1.098E-05 | global batch size: 16 | lm loss: 6.715760E+00 | loss scale: 16384.0 | grad norm: 54753.919 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2477/ 159576 | consumed samples: 39632 | elapsed time per iteration (ms): 13628.3 | learning rate: 1.098E-05 | global batch size: 16 | lm loss: 7.034068E+00 | loss scale: 16384.0 | grad norm: 65362.654 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2478/ 159576 | consumed samples: 39648 | elapsed time per iteration (ms): 14016.3 | learning rate: 1.099E-05 | global batch size: 16 | lm loss: 6.848239E+00 | loss scale: 16384.0 | grad norm: 59886.005 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2479/ 159576 | consumed samples: 39664 | elapsed time per iteration (ms): 13518.2 | learning rate: 1.099E-05 | global batch size: 16 | lm loss: 6.766425E+00 | loss scale: 32768.0 | grad norm: 47600.323 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2480/ 159576 | consumed samples: 39680 | elapsed time per iteration (ms): 13611.4 | learning rate: 1.100E-05 | global batch size: 16 | lm loss: 6.569361E+00 | loss scale: 32768.0 | grad norm: 173183.602 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2481/ 159576 | consumed samples: 39696 | elapsed time per iteration (ms): 13649.6 | learning rate: 1.100E-05 | global batch size: 16 | lm loss: 6.977244E+00 | loss scale: 32768.0 | grad norm: 114608.298 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2482/ 159576 | consumed samples: 39712 | elapsed time per iteration (ms): 13592.7 | learning rate: 1.101E-05 | global batch size: 16 | lm loss: 6.743002E+00 | loss scale: 32768.0 | grad norm: 157122.447 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2483/ 159576 | consumed samples: 39728 | elapsed time per iteration (ms): 13957.3 | learning rate: 1.101E-05 | global batch size: 16 | lm loss: 6.786878E+00 | loss scale: 32768.0 | grad norm: 124608.544 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2484/ 159576 | consumed samples: 39744 | elapsed time per iteration (ms): 13654.6 | learning rate: 1.101E-05 | global batch size: 16 | lm loss: 6.859965E+00 | loss scale: 32768.0 | grad norm: 232222.713 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2485/ 159576 | consumed samples: 39760 | elapsed time per iteration (ms): 13613.9 | learning rate: 1.102E-05 | global batch size: 16 | lm loss: 6.802356E+00 | loss scale: 32768.0 | grad norm: 156829.946 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2486/ 159576 | consumed samples: 39776 | elapsed time per iteration (ms): 13653.4 | learning rate: 1.102E-05 | global batch size: 16 | lm loss: 6.710648E+00 | loss scale: 32768.0 | grad norm: 134523.046 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2487/ 159576 | consumed samples: 39792 | elapsed time per iteration (ms): 14072.7 | learning rate: 1.103E-05 | global batch size: 16 | lm loss: 6.797608E+00 | loss scale: 32768.0 | grad norm: 125011.237 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
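Note the loss scale doubling from 16384.0 to 32768.0 between iterations 2478 and 2479 above. That is the mixed-precision dynamic loss scaler at work: after a long enough run of overflow-free steps it grows the scale, and on an overflow it backs off and the step shows up under "number of skipped iterations". A minimal Python sketch of that mechanism, assuming a Megatron/DeepSpeed-style scaler; the class name and all constants here are illustrative, not this run's actual configuration:

class DynamicLossScaler:
    def __init__(self, init_scale=2.0 ** 14, growth_factor=2.0,
                 backoff_factor=0.5, growth_interval=1000, min_scale=1.0):
        self.scale = init_scale          # 2**14 = 16384, the scale seen above
        self.growth_factor = growth_factor
        self.backoff_factor = backoff_factor
        self.growth_interval = growth_interval
        self.min_scale = min_scale
        self._good_steps = 0             # consecutive steps without inf/nan grads

    def update(self, found_overflow: bool) -> None:
        if found_overflow:
            # Overflow: shrink the scale; the optimizer step is skipped,
            # which the log would count as a "skipped iteration".
            self.scale = max(self.scale * self.backoff_factor, self.min_scale)
            self._good_steps = 0
        else:
            self._good_steps += 1
            if self._good_steps % self.growth_interval == 0:
                # A long overflow-free run: try a larger scale.
                self.scale *= self.growth_factor

With init_scale=2**14, the first growth step reproduces exactly the 16384.0 -> 32768.0 transition logged at iteration 2479.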
- iteration 2488/ 159576 | consumed samples: 39808 | elapsed time per iteration (ms): 13639.9 | learning rate: 1.103E-05 | global batch size: 16 | lm loss: 6.854223E+00 | loss scale: 32768.0 | grad norm: 260551.098 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2489/ 159576 | consumed samples: 39824 | elapsed time per iteration (ms): 13577.6 | learning rate: 1.104E-05 | global batch size: 16 | lm loss: 6.603992E+00 | loss scale: 32768.0 | grad norm: 181893.334 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2490/ 159576 | consumed samples: 39840 | elapsed time per iteration (ms): 13675.7 | learning rate: 1.104E-05 | global batch size: 16 | lm loss: 6.694830E+00 | loss scale: 32768.0 | grad norm: 141757.675 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2491/ 159576 | consumed samples: 39856 | elapsed time per iteration (ms): 14083.9 | learning rate: 1.105E-05 | global batch size: 16 | lm loss: 6.642892E+00 | loss scale: 32768.0 | grad norm: 119287.049 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2492/ 159576 | consumed samples: 39872 | elapsed time per iteration (ms): 13603.6 | learning rate: 1.105E-05 | global batch size: 16 | lm loss: 6.801910E+00 | loss scale: 32768.0 | grad norm: 155539.404 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2493/ 159576 | consumed samples: 39888 | elapsed time per iteration (ms): 13598.7 | learning rate: 1.105E-05 | global batch size: 16 | lm loss: 6.791874E+00 | loss scale: 32768.0 | grad norm: 122407.998 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2494/ 159576 | consumed samples: 39904 | elapsed time per iteration (ms): 13643.8 | learning rate: 1.106E-05 | global batch size: 16 | lm loss: 6.826643E+00 | loss scale: 32768.0 | grad norm: 128586.240 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2495/ 159576 | consumed samples: 39920 | elapsed time per iteration (ms): 13584.0 | learning rate: 1.106E-05 | global batch size: 16 | lm loss: 6.715306E+00 | loss scale: 32768.0 | grad norm: 99484.803 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2496/ 159576 | consumed samples: 39936 | elapsed time per iteration (ms): 13754.1 | learning rate: 1.107E-05 | global batch size: 16 | lm loss: 6.833625E+00 | loss scale: 32768.0 | grad norm: 115202.668 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2497/ 159576 | consumed samples: 39952 | elapsed time per iteration (ms): 13634.3 | learning rate: 1.107E-05 | global batch size: 16 | lm loss: 6.915625E+00 | loss scale: 32768.0 | grad norm: 186838.919 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2498/ 159576 | consumed samples: 39968 | elapsed time per iteration (ms): 13644.0 | learning rate: 1.108E-05 | global batch size: 16 | lm loss: 6.967087E+00 | loss scale: 32768.0 | grad norm: 131122.134 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2499/ 159576 | consumed samples: 39984 | elapsed time per iteration (ms): 13681.7 | learning rate: 1.108E-05 | global batch size: 16 | lm loss: 6.760918E+00 | loss scale: 32768.0 | grad norm: 194624.256 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2500/ 159576 | consumed samples: 40000 | elapsed time per iteration (ms): 14007.6 | learning rate: 1.109E-05 | global batch size: 16 | lm loss: 6.979738E+00 | loss scale: 32768.0 | grad norm: 156689.771 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2501/ 159576 | consumed samples: 40016 | elapsed time per iteration (ms): 13617.5 | learning rate: 1.109E-05 | global batch size: 16 | lm loss: 6.789479E+00 | loss scale: 32768.0 | grad norm: 144780.709 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2502/ 159576 | consumed samples: 40032 | elapsed time per iteration (ms): 13599.5 | learning rate: 1.109E-05 | global batch size: 16 | lm loss: 6.864005E+00 | loss scale: 32768.0 | grad norm: 170229.489 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2503/ 159576 | consumed samples: 40048 | elapsed time per iteration (ms): 13573.2 | learning rate: 1.110E-05 | global batch size: 16 | lm loss: 6.666573E+00 | loss scale: 32768.0 | grad norm: 146264.627 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2504/ 159576 | consumed samples: 40064 | elapsed time per iteration (ms): 13981.7 | learning rate: 1.110E-05 | global batch size: 16 | lm loss: 6.757555E+00 | loss scale: 32768.0 | grad norm: 194432.846 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2505/ 159576 | consumed samples: 40080 | elapsed time per iteration (ms): 13815.5 | learning rate: 1.111E-05 | global batch size: 16 | lm loss: 7.060199E+00 | loss scale: 32768.0 | grad norm: 107664.354 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2506/ 159576 | consumed samples: 40096 | elapsed time per iteration (ms): 13708.3 | learning rate: 1.111E-05 | global batch size: 16 | lm loss: 6.757818E+00 | loss scale: 32768.0 | grad norm: 172391.067 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2507/ 159576 | consumed samples: 40112 | elapsed time per iteration (ms): 13682.1 | learning rate: 1.112E-05 | global batch size: 16 | lm loss: 6.957751E+00 | loss scale: 32768.0 | grad norm: 153732.331 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2508/ 159576 | consumed samples: 40128 | elapsed time per iteration (ms): 13651.8 | learning rate: 1.112E-05 | global batch size: 16 | lm loss: 6.697278E+00 | loss scale: 32768.0 | grad norm: 269873.049 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2509/ 159576 | consumed samples: 40144 | elapsed time per iteration (ms): 13847.8 | learning rate: 1.113E-05 | global batch size: 16 | lm loss: 6.915687E+00 | loss scale: 32768.0 | grad norm: 203672.027 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2510/ 159576 | consumed samples: 40160 | elapsed time per iteration (ms): 13726.7 | learning rate: 1.113E-05 | global batch size: 16 | lm loss: 6.563999E+00 | loss scale: 32768.0 | grad norm: 156793.595 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2511/ 159576 | consumed samples: 40176 | elapsed time per iteration (ms): 13592.8 | learning rate: 1.113E-05 | global batch size: 16 | lm loss: 6.816392E+00 | loss scale: 32768.0 | grad norm: 174319.403 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2512/ 159576 | consumed samples: 40192 | elapsed time per iteration (ms): 13663.1 | learning rate: 1.114E-05 | global batch size: 16 | lm loss: 6.610006E+00 | loss scale: 32768.0 | grad norm: 205941.600 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2513/ 159576 | consumed samples: 40208 | elapsed time per iteration (ms): 13997.4 | learning rate: 1.114E-05 | global batch size: 16 | lm loss: 6.968318E+00 | loss scale: 32768.0 | grad norm: 198426.978 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2514/ 159576 | consumed samples: 40224 | elapsed time per iteration (ms): 13639.5 | learning rate: 1.115E-05 | global batch size: 16 | lm loss: 6.754237E+00 | loss scale: 32768.0 | grad norm: 150994.686 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2515/ 159576 | consumed samples: 40240 | elapsed time per iteration (ms): 13721.6 | learning rate: 1.115E-05 | global batch size: 16 | lm loss: 6.780080E+00 | loss scale: 32768.0 | grad norm: 221933.544 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2516/ 159576 | consumed samples: 40256 | elapsed time per iteration (ms): 13588.8 | learning rate: 1.116E-05 | global batch size: 16 | lm loss: 7.005465E+00 | loss scale: 32768.0 | grad norm: 111981.898 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2517/ 159576 | consumed samples: 40272 | elapsed time per iteration (ms): 13636.9 | learning rate: 1.116E-05 | global batch size: 16 | lm loss: 7.038844E+00 | loss scale: 32768.0 | grad norm: 207331.802 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2518/ 159576 | consumed samples: 40288 | elapsed time per iteration (ms): 13872.4 | learning rate: 1.117E-05 | global batch size: 16 | lm loss: 6.753989E+00 | loss scale: 32768.0 | grad norm: 152725.941 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2519/ 159576 | consumed samples: 40304 | elapsed time per iteration (ms): 13607.9 | learning rate: 1.117E-05 | global batch size: 16 | lm loss: 6.981558E+00 | loss scale: 32768.0 | grad norm: 154949.465 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2520/ 159576 | consumed samples: 40320 | elapsed time per iteration (ms): 13684.9 | learning rate: 1.117E-05 | global batch size: 16 | lm loss: 6.906241E+00 | loss scale: 32768.0 | grad norm: 125549.575 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2521/ 159576 | consumed samples: 40336 | elapsed time per iteration (ms): 13716.2 | learning rate: 1.118E-05 | global batch size: 16 | lm loss: 6.747027E+00 | loss scale: 32768.0 | grad norm: 122780.845 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2522/ 159576 | consumed samples: 40352 | elapsed time per iteration (ms): 14167.1 | learning rate: 1.118E-05 | global batch size: 16 | lm loss: 6.970352E+00 | loss scale: 32768.0 | grad norm: 118819.513 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2523/ 159576 | consumed samples: 40368 | elapsed time per iteration (ms): 13664.4 | learning rate: 1.119E-05 | global batch size: 16 | lm loss: 6.714174E+00 | loss scale: 32768.0 | grad norm: 146027.986 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2524/ 159576 | consumed samples: 40384 | elapsed time per iteration (ms): 13630.7 | learning rate: 1.119E-05 | global batch size: 16 | lm loss: 6.610335E+00 | loss scale: 32768.0 | grad norm: 242081.240 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2525/ 159576 | consumed samples: 40400 | elapsed time per iteration (ms): 13685.5 | learning rate: 1.120E-05 | global batch size: 16 | lm loss: 6.889633E+00 | loss scale: 32768.0 | grad norm: 125371.781 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2526/ 159576 | consumed samples: 40416 | elapsed time per iteration (ms): 13989.6 | learning rate: 1.120E-05 | global batch size: 16 | lm loss: 6.703308E+00 | loss scale: 32768.0 | grad norm: 229244.600 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2527/ 159576 | consumed samples: 40432 | elapsed time per iteration (ms): 13653.7 | learning rate: 1.121E-05 | global batch size: 16 | lm loss: 6.903625E+00 | loss scale: 32768.0 | grad norm: 180615.201 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2528/ 159576 | consumed samples: 40448 | elapsed time per iteration (ms): 13688.8 | learning rate: 1.121E-05 | global batch size: 16 | lm loss: 6.882591E+00 | loss scale: 32768.0 | grad norm: 123446.214 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2529/ 159576 | consumed samples: 40464 | elapsed time per iteration (ms): 13727.9 | learning rate: 1.121E-05 | global batch size: 16 | lm loss: 6.771068E+00 | loss scale: 32768.0 | grad norm: 136122.381 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2530/ 159576 | consumed samples: 40480 | elapsed time per iteration (ms): 13727.3 | learning rate: 1.122E-05 | global batch size: 16 | lm loss: 6.839997E+00 | loss scale: 32768.0 | grad norm: 198759.749 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2531/ 159576 | consumed samples: 40496 | elapsed time per iteration (ms): 13882.2 | learning rate: 1.122E-05 | global batch size: 16 | lm loss: 6.934726E+00 | loss scale: 32768.0 | grad norm: 140393.181 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2532/ 159576 | consumed samples: 40512 | elapsed time per iteration (ms): 13707.7 | learning rate: 1.123E-05 | global batch size: 16 | lm loss: 6.824786E+00 | loss scale: 32768.0 | grad norm: 136497.509 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2533/ 159576 | consumed samples: 40528 | elapsed time per iteration (ms): 13668.7 | learning rate: 1.123E-05 | global batch size: 16 | lm loss: 6.638996E+00 | loss scale: 32768.0 | grad norm: 108086.442 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2534/ 159576 | consumed samples: 40544 | elapsed time per iteration (ms): 13600.7 | learning rate: 1.124E-05 | global batch size: 16 | lm loss: 6.684957E+00 | loss scale: 32768.0 | grad norm: 136205.291 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2535/ 159576 | consumed samples: 40560 | elapsed time per iteration (ms): 14008.2 | learning rate: 1.124E-05 | global batch size: 16 | lm loss: 6.650595E+00 | loss scale: 32768.0 | grad norm: 89458.356 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2536/ 159576 | consumed samples: 40576 | elapsed time per iteration (ms): 13696.2 | learning rate: 1.125E-05 | global batch size: 16 | lm loss: 6.720654E+00 | loss scale: 32768.0 | grad norm: 207949.897 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2537/ 159576 | consumed samples: 40592 | elapsed time per iteration (ms): 13728.0 | learning rate: 1.125E-05 | global batch size: 16 | lm loss: 6.934484E+00 | loss scale: 32768.0 | grad norm: 145165.262 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2538/ 159576 | consumed samples: 40608 | elapsed time per iteration (ms): 13707.3 | learning rate: 1.125E-05 | global batch size: 16 | lm loss: 6.659933E+00 | loss scale: 32768.0 | grad norm: 109227.116 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2539/ 159576 | consumed samples: 40624 | elapsed time per iteration (ms): 14115.0 | learning rate: 1.126E-05 | global batch size: 16 | lm loss: 6.638377E+00 | loss scale: 32768.0 | grad norm: 221623.574 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2540/ 159576 | consumed samples: 40640 | elapsed time per iteration (ms): 13557.7 | learning rate: 1.126E-05 | global batch size: 16 | lm loss: 6.825821E+00 | loss scale: 32768.0 | grad norm: 114656.887 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2541/ 159576 | consumed samples: 40656 | elapsed time per iteration (ms): 13635.6 | learning rate: 1.127E-05 | global batch size: 16 | lm loss: 6.869952E+00 | loss scale: 32768.0 | grad norm: 204975.764 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2542/ 159576 | consumed samples: 40672 | elapsed time per iteration (ms): 13682.2 | learning rate: 1.127E-05 | global batch size: 16 | lm loss: 6.829473E+00 | loss scale: 32768.0 | grad norm: 158875.582 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2543/ 159576 | consumed samples: 40688 | elapsed time per iteration (ms): 13675.9 | learning rate: 1.128E-05 | global batch size: 16 | lm loss: 6.921135E+00 | loss scale: 32768.0 | grad norm: 248424.787 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2544/ 159576 | consumed samples: 40704 | elapsed time per iteration (ms): 14035.2 | learning rate: 1.128E-05 | global batch size: 16 | lm loss: 6.734321E+00 | loss scale: 32768.0 | grad norm: 137358.902 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2545/ 159576 | consumed samples: 40720 | elapsed time per iteration (ms): 13685.4 | learning rate: 1.129E-05 | global batch size: 16 | lm loss: 6.824071E+00 | loss scale: 32768.0 | grad norm: 172473.743 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2546/ 159576 | consumed samples: 40736 | elapsed time per iteration (ms): 13704.2 | learning rate: 1.129E-05 | global batch size: 16 | lm loss: 6.741428E+00 | loss scale: 32768.0 | grad norm: 117821.390 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2547/ 159576 | consumed samples: 40752 | elapsed time per iteration (ms): 13625.1 | learning rate: 1.129E-05 | global batch size: 16 | lm loss: 6.825446E+00 | loss scale: 32768.0 | grad norm: 302813.390 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2548/ 159576 | consumed samples: 40768 | elapsed time per iteration (ms): 13978.9 | learning rate: 1.130E-05 | global batch size: 16 | lm loss: 6.930991E+00 | loss scale: 32768.0 | grad norm: 163222.779 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2549/ 159576 | consumed samples: 40784 | elapsed time per iteration (ms): 13605.2 | learning rate: 1.130E-05 | global batch size: 16 | lm loss: 6.901045E+00 | loss scale: 32768.0 | grad norm: 178776.030 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2550/ 159576 | consumed samples: 40800 | elapsed time per iteration (ms): 13704.5 | learning rate: 1.131E-05 | global batch size: 16 | lm loss: 6.923467E+00 | loss scale: 32768.0 | grad norm: 156500.588 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2551/ 159576 | consumed samples: 40816 | elapsed time per iteration (ms): 13642.0 | learning rate: 1.131E-05 | global batch size: 16 | lm loss: 6.698053E+00 | loss scale: 32768.0 | grad norm: 142885.953 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2552/ 159576 | consumed samples: 40832 | elapsed time per iteration (ms): 13988.3 | learning rate: 1.132E-05 | global batch size: 16 | lm loss: 6.774540E+00 | loss scale: 32768.0 | grad norm: 236886.022 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2553/ 159576 | consumed samples: 40848 | elapsed time per iteration (ms): 13862.8 | learning rate: 1.132E-05 | global batch size: 16 | lm loss: 6.706432E+00 | loss scale: 32768.0 | grad norm: 178546.693 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2554/ 159576 | consumed samples: 40864 | elapsed time per iteration (ms): 13629.3 | learning rate: 1.133E-05 | global batch size: 16 | lm loss: 6.631795E+00 | loss scale: 32768.0 | grad norm: 176739.826 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2555/ 159576 | consumed samples: 40880 | elapsed time per iteration (ms): 13608.3 | learning rate: 1.133E-05 | global batch size: 16 | lm loss: 7.180985E+00 | loss scale: 32768.0 | grad norm: 132584.462 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2556/ 159576 | consumed samples: 40896 | elapsed time per iteration (ms): 13580.0 | learning rate: 1.133E-05 | global batch size: 16 | lm loss: 6.838911E+00 | loss scale: 32768.0 | grad norm: 90158.811 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2557/ 159576 | consumed samples: 40912 | elapsed time per iteration (ms): 13942.7 | learning rate: 1.134E-05 | global batch size: 16 | lm loss: 6.693833E+00 | loss scale: 32768.0 | grad norm: 220674.059 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2558/ 159576 | consumed samples: 40928 | elapsed time per iteration (ms): 13802.7 | learning rate: 1.134E-05 | global batch size: 16 | lm loss: 6.568502E+00 | loss scale: 32768.0 | grad norm: 98298.873 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2559/ 159576 | consumed samples: 40944 | elapsed time per iteration (ms): 13641.3 | learning rate: 1.135E-05 | global batch size: 16 | lm loss: 6.635581E+00 | loss scale: 32768.0 | grad norm: 169974.305 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2560/ 159576 | consumed samples: 40960 | elapsed time per iteration (ms): 13704.3 | learning rate: 1.135E-05 | global batch size: 16 | lm loss: 6.565581E+00 | loss scale: 32768.0 | grad norm: 129387.649 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2561/ 159576 | consumed samples: 40976 | elapsed time per iteration (ms): 14001.7 | learning rate: 1.136E-05 | global batch size: 16 | lm loss: 6.892058E+00 | loss scale: 32768.0 | grad norm: 339367.225 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2562/ 159576 | consumed samples: 40992 | elapsed time per iteration (ms): 13513.6 | learning rate: 1.136E-05 | global batch size: 16 | lm loss: 6.762362E+00 | loss scale: 32768.0 | grad norm: 232794.254 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2563/ 159576 | consumed samples: 41008 | elapsed time per iteration (ms): 13601.0 | learning rate: 1.137E-05 | global batch size: 16 | lm loss: 6.843441E+00 | loss scale: 32768.0 | grad norm: 163664.983 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2564/ 159576 | consumed samples: 41024 | elapsed time per iteration (ms): 13594.8 | learning rate: 1.137E-05 | global batch size: 16 | lm loss: 6.819015E+00 | loss scale: 32768.0 | grad norm: 216339.488 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2565/ 159576 | consumed samples: 41040 | elapsed time per iteration (ms): 13605.6 | learning rate: 1.137E-05 | global batch size: 16 | lm loss: 6.897832E+00 | loss scale: 32768.0 | grad norm: 109607.502 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2566/ 159576 | consumed samples: 41056 | elapsed time per iteration (ms): 13861.7 | learning rate: 1.138E-05 | global batch size: 16 | lm loss: 6.875384E+00 | loss scale: 32768.0 | grad norm: 402667.413 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2567/ 159576 | consumed samples: 41072 | elapsed time per iteration (ms): 13673.7 | learning rate: 1.138E-05 | global batch size: 16 | lm loss: 6.761830E+00 | loss scale: 32768.0 | grad norm: 270664.955 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2568/ 159576 | consumed samples: 41088 | elapsed time per iteration (ms): 13672.1 | learning rate: 1.139E-05 | global batch size: 16 | lm loss: 6.792128E+00 | loss scale: 32768.0 | grad norm: 176693.891 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2569/ 159576 | consumed samples: 41104 | elapsed time per iteration (ms): 13609.9 | learning rate: 1.139E-05 | global batch size: 16 | lm loss: 6.935215E+00 | loss scale: 32768.0 | grad norm: 182679.180 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-[2021-09-24 12:06:15] PULSE: tr8-104B is waiting for the previous job to finish before scheduling a new one using the dependency mechanism (1165978_[1-10%1] on 'gpu_p13' partition)
-[2021-09-24 12:06:15] PULSE: tr8-104B is running for 6:14:04 since 2021-09-24T05:52:11 (1162855_1 on 'gpu_p13' partition (r6i4n[5,7],r6i5n[2,7-8],r6i6n[0,2,6],r7i2n[4-5],r7i6n[2-4],r7i7n[7-8],r8i0n[2-3,5-8],r8i1n[0,2-4],r8i2n8,r8i3n[0-2],r8i5n[3-4],r8i7n[3-8],r9i0n[0-2],r9i1n[0-3],r9i2n[3-5,8],r9i3n[0-1,7-8],r9i4n[0-2],r9i5n[3-8],r9i6n[0,7-8])
- iteration 2570/ 159576 | consumed samples: 41120 | elapsed time per iteration (ms): 14149.7 | learning rate: 1.140E-05 | global batch size: 16 | lm loss: 6.826759E+00 | loss scale: 32768.0 | grad norm: 135711.486 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2571/ 159576 | consumed samples: 41136 | elapsed time per iteration (ms): 13749.2 | learning rate: 1.140E-05 | global batch size: 16 | lm loss: 6.600703E+00 | loss scale: 32768.0 | grad norm: 143461.893 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2572/ 159576 | consumed samples: 41152 | elapsed time per iteration (ms): 13601.5 | learning rate: 1.141E-05 | global batch size: 16 | lm loss: 6.747102E+00 | loss scale: 32768.0 | grad norm: 205480.052 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2573/ 159576 | consumed samples: 41168 | elapsed time per iteration (ms): 13680.7 | learning rate: 1.141E-05 | global batch size: 16 | lm loss: 6.767237E+00 | loss scale: 32768.0 | grad norm: 186807.581 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2574/ 159576 | consumed samples: 41184 | elapsed time per iteration (ms): 14103.7 | learning rate: 1.141E-05 | global batch size: 16 | lm loss: 6.786840E+00 | loss scale: 32768.0 | grad norm: 125986.096 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2575/ 159576 | consumed samples: 41200 | elapsed time per iteration (ms): 13634.6 | learning rate: 1.142E-05 | global batch size: 16 | lm loss: 6.740016E+00 | loss scale: 32768.0 | grad norm: 127578.945 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2576/ 159576 | consumed samples: 41216 | elapsed time per iteration (ms): 13632.4 | learning rate: 1.142E-05 | global batch size: 16 | lm loss: 6.717787E+00 | loss scale: 32768.0 | grad norm: 91352.288 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2577/ 159576 | consumed samples: 41232 | elapsed time per iteration (ms): 13613.7 | learning rate: 1.143E-05 | global batch size: 16 | lm loss: 6.736307E+00 | loss scale: 32768.0 | grad norm: 161126.891 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
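The PULSE lines above are heartbeat messages from the run's monitoring, reporting SLURM queue state: 1165978_[1-10%1] denotes a SLURM job array of ten elements throttled to at most one running element, so each new training slot is scheduled only after the previous one leaves the queue. A hedged sketch of how such a self-chaining array is typically submitted; the script name is a placeholder, not anything taken from this log:

import subprocess

# Submit a 10-element SLURM job array with at most one element running
# at a time ("%1"); combined with job dependencies this produces the
# chained scheduling that PULSE reports. "tr8-104B.slurm" is a
# placeholder script name, not taken from this log.
subprocess.run(["sbatch", "--array=1-10%1", "tr8-104B.slurm"], check=True)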
- iteration 2578/ 159576 | consumed samples: 41248 | elapsed time per iteration (ms): 13501.7 | learning rate: 1.143E-05 | global batch size: 16 | lm loss: 6.725785E+00 | loss scale: 32768.0 | grad norm: 105065.485 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2579/ 159576 | consumed samples: 41264 | elapsed time per iteration (ms): 13746.0 | learning rate: 1.144E-05 | global batch size: 16 | lm loss: 6.731723E+00 | loss scale: 32768.0 | grad norm: 123413.248 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2580/ 159576 | consumed samples: 41280 | elapsed time per iteration (ms): 13621.8 | learning rate: 1.144E-05 | global batch size: 16 | lm loss: 6.889888E+00 | loss scale: 32768.0 | grad norm: 128934.561 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2581/ 159576 | consumed samples: 41296 | elapsed time per iteration (ms): 13634.3 | learning rate: 1.145E-05 | global batch size: 16 | lm loss: 6.845993E+00 | loss scale: 32768.0 | grad norm: 140353.622 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2582/ 159576 | consumed samples: 41312 | elapsed time per iteration (ms): 13645.1 | learning rate: 1.145E-05 | global batch size: 16 | lm loss: 6.922751E+00 | loss scale: 32768.0 | grad norm: 193649.510 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2583/ 159576 | consumed samples: 41328 | elapsed time per iteration (ms): 14012.6 | learning rate: 1.145E-05 | global batch size: 16 | lm loss: 6.706060E+00 | loss scale: 32768.0 | grad norm: 120536.730 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2584/ 159576 | consumed samples: 41344 | elapsed time per iteration (ms): 13567.7 | learning rate: 1.146E-05 | global batch size: 16 | lm loss: 6.729124E+00 | loss scale: 32768.0 | grad norm: 150036.593 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2585/ 159576 | consumed samples: 41360 | elapsed time per iteration (ms): 13534.2 | learning rate: 1.146E-05 | global batch size: 16 | lm loss: 6.841982E+00 | loss scale: 32768.0 | grad norm: 169788.083 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2586/ 159576 | consumed samples: 41376 | elapsed time per iteration (ms): 13556.0 | learning rate: 1.147E-05 | global batch size: 16 | lm loss: 6.813578E+00 | loss scale: 32768.0 | grad norm: 120615.854 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2587/ 159576 | consumed samples: 41392 | elapsed time per iteration (ms): 13668.2 | learning rate: 1.147E-05 | global batch size: 16 | lm loss: 6.675393E+00 | loss scale: 32768.0 | grad norm: 202372.780 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2588/ 159576 | consumed samples: 41408 | elapsed time per iteration (ms): 13867.2 | learning rate: 1.148E-05 | global batch size: 16 | lm loss: 6.796386E+00 | loss scale: 32768.0 | grad norm: 131901.199 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2589/ 159576 | consumed samples: 41424 | elapsed time per iteration (ms): 13636.7 | learning rate: 1.148E-05 | global batch size: 16 | lm loss: 6.783171E+00 | loss scale: 32768.0 | grad norm: 127655.447 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2590/ 159576 | consumed samples: 41440 | elapsed time per iteration (ms): 13677.9 | learning rate: 1.149E-05 | global batch size: 16 | lm loss: 6.672108E+00 | loss scale: 32768.0 | grad norm: 111803.439 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2591/ 159576 | consumed samples: 41456 | elapsed time per iteration (ms): 13670.0 | learning rate: 1.149E-05 | global batch size: 16 | lm loss: 6.894643E+00 | loss scale: 32768.0 | grad norm: 156503.152 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2592/ 159576 | consumed samples: 41472 | elapsed time per iteration (ms): 14137.5 | learning rate: 1.149E-05 | global batch size: 16 | lm loss: 6.765024E+00 | loss scale: 32768.0 | grad norm: 160594.152 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2593/ 159576 | consumed samples: 41488 | elapsed time per iteration (ms): 13635.7 | learning rate: 1.150E-05 | global batch size: 16 | lm loss: 6.882227E+00 | loss scale: 32768.0 | grad norm: 142008.845 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2594/ 159576 | consumed samples: 41504 | elapsed time per iteration (ms): 13592.8 | learning rate: 1.150E-05 | global batch size: 16 | lm loss: 6.750668E+00 | loss scale: 32768.0 | grad norm: 137376.665 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2595/ 159576 | consumed samples: 41520 | elapsed time per iteration (ms): 13572.7 | learning rate: 1.151E-05 | global batch size: 16 | lm loss: 6.870511E+00 | loss scale: 32768.0 | grad norm: 203139.065 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2596/ 159576 | consumed samples: 41536 | elapsed time per iteration (ms): 13955.3 | learning rate: 1.151E-05 | global batch size: 16 | lm loss: 6.952578E+00 | loss scale: 32768.0 | grad norm: 259660.982 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2597/ 159576 | consumed samples: 41552 | elapsed time per iteration (ms): 13711.6 | learning rate: 1.152E-05 | global batch size: 16 | lm loss: 6.681178E+00 | loss scale: 32768.0 | grad norm: 126907.178 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2598/ 159576 | consumed samples: 41568 | elapsed time per iteration (ms): 13707.8 | learning rate: 1.152E-05 | global batch size: 16 | lm loss: 6.610268E+00 | loss scale: 32768.0 | grad norm: 135897.348 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2599/ 159576 | consumed samples: 41584 | elapsed time per iteration (ms): 13564.4 | learning rate: 1.153E-05 | global batch size: 16 | lm loss: 6.826151E+00 | loss scale: 32768.0 | grad norm: 155911.584 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2600/ 159576 | consumed samples: 41600 | elapsed time per iteration (ms): 13546.1 | learning rate: 1.153E-05 | global batch size: 16 | lm loss: 6.632576E+00 | loss scale: 32768.0 | grad norm: 252409.904 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2601/ 159576 | consumed samples: 41616 | elapsed time per iteration (ms): 13887.8 | learning rate: 1.153E-05 | global batch size: 16 | lm loss: 6.631788E+00 | loss scale: 32768.0 | grad norm: 165940.601 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2602/ 159576 | consumed samples: 41632 | elapsed time per iteration (ms): 13567.8 | learning rate: 1.154E-05 | global batch size: 16 | lm loss: 6.939396E+00 | loss scale: 32768.0 | grad norm: 124805.953 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2603/ 159576 | consumed samples: 41648 | elapsed time per iteration (ms): 13581.4 | learning rate: 1.154E-05 | global batch size: 16 | lm loss: 6.924129E+00 | loss scale: 32768.0 | grad norm: 133938.726 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2604/ 159576 | consumed samples: 41664 | elapsed time per iteration (ms): 13613.2 | learning rate: 1.155E-05 | global batch size: 16 | lm loss: 6.660190E+00 | loss scale: 32768.0 | grad norm: 188689.396 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2605/ 159576 | consumed samples: 41680 | elapsed time per iteration (ms): 14144.8 | learning rate: 1.155E-05 | global batch size: 16 | lm loss: 6.643148E+00 | loss scale: 32768.0 | grad norm: 123140.550 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2606/ 159576 | consumed samples: 41696 | elapsed time per iteration (ms): 13667.3 | learning rate: 1.156E-05 | global batch size: 16 | lm loss: 6.805959E+00 | loss scale: 32768.0 | grad norm: 196566.691 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2607/ 159576 | consumed samples: 41712 | elapsed time per iteration (ms): 13574.2 | learning rate: 1.156E-05 | global batch size: 16 | lm loss: 6.711599E+00 | loss scale: 32768.0 | grad norm: 167578.316 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2608/ 159576 | consumed samples: 41728 | elapsed time per iteration (ms): 13571.4 | learning rate: 1.157E-05 | global batch size: 16 | lm loss: 6.852364E+00 | loss scale: 32768.0 | grad norm: 120545.344 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2609/ 159576 | consumed samples: 41744 | elapsed time per iteration (ms): 13823.4 | learning rate: 1.157E-05 | global batch size: 16 | lm loss: 6.988579E+00 | loss scale: 32768.0 | grad norm: 242130.577 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2610/ 159576 | consumed samples: 41760 | elapsed time per iteration (ms): 13677.8 | learning rate: 1.157E-05 | global batch size: 16 | lm loss: 6.640975E+00 | loss scale: 32768.0 | grad norm: 193270.029 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2611/ 159576 | consumed samples: 41776 | elapsed time per iteration (ms): 13648.9 | learning rate: 1.158E-05 | global batch size: 16 | lm loss: 6.554218E+00 | loss scale: 32768.0 | grad norm: 132307.655 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2612/ 159576 | consumed samples: 41792 | elapsed time per iteration (ms): 13675.5 | learning rate: 1.158E-05 | global batch size: 16 | lm loss: 6.875402E+00 | loss scale: 32768.0 | grad norm: 127017.802 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2613/ 159576 | consumed samples: 41808 | elapsed time per iteration (ms): 13589.6 | learning rate: 1.159E-05 | global batch size: 16 | lm loss: 6.853450E+00 | loss scale: 32768.0 | grad norm: 271835.942 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2614/ 159576 | consumed samples: 41824 | elapsed time per iteration (ms): 13981.2 | learning rate: 1.159E-05 | global batch size: 16 | lm loss: 6.810247E+00 | loss scale: 32768.0 | grad norm: 210644.569 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2615/ 159576 | consumed samples: 41840 | elapsed time per iteration (ms): 13580.3 | learning rate: 1.160E-05 | global batch size: 16 | lm loss: 6.856892E+00 | loss scale: 32768.0 | grad norm: 139996.135 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2616/ 159576 | consumed samples: 41856 | elapsed time per iteration (ms): 13592.7 | learning rate: 1.160E-05 | global batch size: 16 | lm loss: 6.687234E+00 | loss scale: 32768.0 | grad norm: 130216.414 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2617/ 159576 | consumed samples: 41872 | elapsed time per iteration (ms): 13579.5 | learning rate: 1.161E-05 | global batch size: 16 | lm loss: 6.753475E+00 | loss scale: 32768.0 | grad norm: 270435.281 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2618/ 159576 | consumed samples: 41888 | elapsed time per iteration (ms): 14037.5 | learning rate: 1.161E-05 | global batch size: 16 | lm loss: 6.964073E+00 | loss scale: 32768.0 | grad norm: 185416.747 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2619/ 159576 | consumed samples: 41904 | elapsed time per iteration (ms): 13552.1 | learning rate: 1.161E-05 | global batch size: 16 | lm loss: 6.609634E+00 | loss scale: 32768.0 | grad norm: 157098.176 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2620/ 159576 | consumed samples: 41920 | elapsed time per iteration (ms): 13574.2 | learning rate: 1.162E-05 | global batch size: 16 | lm loss: 7.006974E+00 | loss scale: 32768.0 | grad norm: 140378.271 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2621/ 159576 | consumed samples: 41936 | elapsed time per iteration (ms): 13648.0 | learning rate: 1.162E-05 | global batch size: 16 | lm loss: 6.562167E+00 | loss scale: 32768.0 | grad norm: 169654.536 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2622/ 159576 | consumed samples: 41952 | elapsed time per iteration (ms): 13713.4 | learning rate: 1.163E-05 | global batch size: 16 | lm loss: 6.810758E+00 | loss scale: 32768.0 | grad norm: 209798.087 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2623/ 159576 | consumed samples: 41968 | elapsed time per iteration (ms): 13925.7 | learning rate: 1.163E-05 | global batch size: 16 | lm loss: 6.522465E+00 | loss scale: 32768.0 | grad norm: 119471.106 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2624/ 159576 | consumed samples: 41984 | elapsed time per iteration (ms): 13583.0 | learning rate: 1.164E-05 | global batch size: 16 | lm loss: 6.827784E+00 | loss scale: 32768.0 | grad norm: 115498.472 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2625/ 159576 | consumed samples: 42000 | elapsed time per iteration (ms): 13618.7 | learning rate: 1.164E-05 | global batch size: 16 | lm loss: 6.663583E+00 | loss scale: 32768.0 | grad norm: 131333.385 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2626/ 159576 | consumed samples: 42016 | elapsed time per iteration (ms): 13695.0 | learning rate: 1.164E-05 | global batch size: 16 | lm loss: 6.731676E+00 | loss scale: 32768.0 | grad norm: 105476.602 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2627/ 159576 | consumed samples: 42032 | elapsed time per iteration (ms): 14032.3 | learning rate: 1.165E-05 | global batch size: 16 | lm loss: 6.635394E+00 | loss scale: 32768.0 | grad norm: 155841.088 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2628/ 159576 | consumed samples: 42048 | elapsed time per iteration (ms): 13596.4 | learning rate: 1.165E-05 | global batch size: 16 | lm loss: 6.768427E+00 | loss scale: 32768.0 | grad norm: 91352.945 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2629/ 159576 | consumed samples: 42064 | elapsed time per iteration (ms): 13735.4 | learning rate: 1.166E-05 | global batch size: 16 | lm loss: 6.877464E+00 | loss scale: 32768.0 | grad norm: 246645.890 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2630/ 159576 | consumed samples: 42080 | elapsed time per iteration (ms): 13558.6 | learning rate: 1.166E-05 | global batch size: 16 | lm loss: 6.714092E+00 | loss scale: 32768.0 | grad norm: 131077.473 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2631/ 159576 | consumed samples: 42096 | elapsed time per iteration (ms): 14063.2 | learning rate: 1.167E-05 | global batch size: 16 | lm loss: 6.598214E+00 | loss scale: 32768.0 | grad norm: 142113.685 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2632/ 159576 | consumed samples: 42112 | elapsed time per iteration (ms): 13570.0 | learning rate: 1.167E-05 | global batch size: 16 | lm loss: 6.958339E+00 | loss scale: 32768.0 | grad norm: 196255.218 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2633/ 159576 | consumed samples: 42128 | elapsed time per iteration (ms): 13592.6 | learning rate: 1.168E-05 | global batch size: 16 | lm loss: 6.596231E+00 | loss scale: 32768.0 | grad norm: 167680.420 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2634/ 159576 | consumed samples: 42144 | elapsed time per iteration (ms): 13671.7 | learning rate: 1.168E-05 | global batch size: 16 | lm loss: 6.775526E+00 | loss scale: 32768.0 | grad norm: 111055.921 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2635/ 159576 | consumed samples: 42160
| elapsed time per iteration (ms): 13642.2 | learning rate: 1.168E-05 | global batch size: 16 | lm loss: 6.786438E+00 | loss scale: 32768.0 | grad norm: 146172.944 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2636/ 159576 | consumed samples: 42176 | elapsed time per iteration (ms): 14001.7 | learning rate: 1.169E-05 | global batch size: 16 | lm loss: 6.785826E+00 | loss scale: 32768.0 | grad norm: 101705.287 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2637/ 159576 | consumed samples: 42192 | elapsed time per iteration (ms): 13632.3 | learning rate: 1.169E-05 | global batch size: 16 | lm loss: 6.918137E+00 | loss scale: 32768.0 | grad norm: 359289.431 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2638/ 159576 | consumed samples: 42208 | elapsed time per iteration (ms): 13642.4 | learning rate: 1.170E-05 | global batch size: 16 | lm loss: 6.474925E+00 | loss scale: 32768.0 | grad norm: 210644.789 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2639/ 159576 | consumed samples: 42224 | elapsed time per iteration (ms): 13584.1 | learning rate: 1.170E-05 | global batch size: 16 | lm loss: 6.622705E+00 | loss scale: 32768.0 | grad norm: 159853.582 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2640/ 159576 | consumed samples: 42240 | elapsed time per iteration (ms): 13928.4 | learning rate: 1.171E-05 | global batch size: 16 | lm loss: 6.883276E+00 | loss scale: 32768.0 | grad norm: 134874.626 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2641/ 159576 | consumed samples: 42256 | elapsed time per iteration (ms): 13672.3 | learning rate: 1.171E-05 | global batch size: 16 | lm loss: 6.975843E+00 | loss scale: 32768.0 | grad norm: 136138.664 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2642/ 159576 | consumed samples: 42272 | elapsed time per iteration (ms): 13705.7 | learning rate: 1.172E-05 | global batch size: 16 | lm loss: 6.698567E+00 | loss scale: 32768.0 | grad norm: 132708.794 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2643/ 159576 | consumed samples: 42288 | elapsed time per iteration (ms): 13640.4 | learning rate: 1.172E-05 | global batch size: 16 | lm loss: 6.910300E+00 | loss scale: 32768.0 | grad norm: 128937.691 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2644/ 159576 | consumed samples: 42304 | elapsed time per iteration (ms): 13924.6 | learning rate: 1.172E-05 | global batch size: 16 | lm loss: 6.661136E+00 | loss scale: 32768.0 | grad norm: 144385.230 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2645/ 159576 | consumed samples: 42320 | elapsed time per iteration (ms): 13731.5 | learning rate: 1.173E-05 | global batch size: 16 | lm loss: 6.749330E+00 | loss scale: 32768.0 | grad norm: 136497.410 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2646/ 159576 | consumed samples: 42336 | elapsed time per iteration (ms): 13631.6 | learning rate: 1.173E-05 | global batch size: 16 | lm loss: 6.774727E+00 | loss scale: 32768.0 | grad norm: 
157115.457 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2647/ 159576 | consumed samples: 42352 | elapsed time per iteration (ms): 13587.3 | learning rate: 1.174E-05 | global batch size: 16 | lm loss: 6.897247E+00 | loss scale: 32768.0 | grad norm: 122884.703 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2648/ 159576 | consumed samples: 42368 | elapsed time per iteration (ms): 13582.9 | learning rate: 1.174E-05 | global batch size: 16 | lm loss: 6.902627E+00 | loss scale: 32768.0 | grad norm: 136617.675 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2649/ 159576 | consumed samples: 42384 | elapsed time per iteration (ms): 14194.1 | learning rate: 1.175E-05 | global batch size: 16 | lm loss: 6.654990E+00 | loss scale: 32768.0 | grad norm: 121668.456 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2650/ 159576 | consumed samples: 42400 | elapsed time per iteration (ms): 13827.0 | learning rate: 1.175E-05 | global batch size: 16 | lm loss: 6.718140E+00 | loss scale: 32768.0 | grad norm: 94592.966 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2651/ 159576 | consumed samples: 42416 | elapsed time per iteration (ms): 13600.7 | learning rate: 1.176E-05 | global batch size: 16 | lm loss: 6.674122E+00 | loss scale: 32768.0 | grad norm: 105220.566 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2652/ 159576 | consumed samples: 42432 | elapsed time per iteration (ms): 13643.1 | learning rate: 1.176E-05 | global batch size: 16 | lm loss: 6.662145E+00 | loss scale: 32768.0 | grad norm: 222158.908 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2653/ 159576 | consumed samples: 42448 | elapsed time per iteration (ms): 13957.5 | learning rate: 1.176E-05 | global batch size: 16 | lm loss: 6.613699E+00 | loss scale: 32768.0 | grad norm: 110830.033 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2654/ 159576 | consumed samples: 42464 | elapsed time per iteration (ms): 13668.1 | learning rate: 1.177E-05 | global batch size: 16 | lm loss: 6.510882E+00 | loss scale: 32768.0 | grad norm: 143615.139 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2655/ 159576 | consumed samples: 42480 | elapsed time per iteration (ms): 13633.2 | learning rate: 1.177E-05 | global batch size: 16 | lm loss: 6.732093E+00 | loss scale: 32768.0 | grad norm: 159462.660 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2656/ 159576 | consumed samples: 42496 | elapsed time per iteration (ms): 13620.1 | learning rate: 1.178E-05 | global batch size: 16 | lm loss: 6.660037E+00 | loss scale: 32768.0 | grad norm: 244166.739 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2657/ 159576 | consumed samples: 42512 | elapsed time per iteration (ms): 13831.3 | learning rate: 1.178E-05 | global batch size: 16 | lm loss: 6.626472E+00 | loss scale: 32768.0 | grad norm: 149275.048 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2658/ 159576 | consumed samples: 42528 
| elapsed time per iteration (ms): 13824.8 | learning rate: 1.179E-05 | global batch size: 16 | lm loss: 6.687421E+00 | loss scale: 32768.0 | grad norm: 139977.063 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2659/ 159576 | consumed samples: 42544 | elapsed time per iteration (ms): 13722.5 | learning rate: 1.179E-05 | global batch size: 16 | lm loss: 6.524724E+00 | loss scale: 32768.0 | grad norm: 106042.464 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2660/ 159576 | consumed samples: 42560 | elapsed time per iteration (ms): 13670.7 | learning rate: 1.180E-05 | global batch size: 16 | lm loss: 6.908322E+00 | loss scale: 32768.0 | grad norm: 201686.670 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2661/ 159576 | consumed samples: 42576 | elapsed time per iteration (ms): 13612.7 | learning rate: 1.180E-05 | global batch size: 16 | lm loss: 6.837928E+00 | loss scale: 32768.0 | grad norm: 126017.738 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2662/ 159576 | consumed samples: 42592 | elapsed time per iteration (ms): 13941.2 | learning rate: 1.180E-05 | global batch size: 16 | lm loss: 6.439098E+00 | loss scale: 32768.0 | grad norm: 160984.308 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2663/ 159576 | consumed samples: 42608 | elapsed time per iteration (ms): 13713.4 | learning rate: 1.181E-05 | global batch size: 16 | lm loss: 6.723923E+00 | loss scale: 32768.0 | grad norm: 139598.213 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2664/ 159576 | consumed samples: 42624 | elapsed time per iteration (ms): 6797.7 | learning rate: 1.181E-05 | global batch size: 16 | lm loss: 7.335284E+00 | loss scale: 32768.0 | grad norm: 139598.213 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2665/ 159576 | consumed samples: 42640 | elapsed time per iteration (ms): 13135.0 | learning rate: 1.181E-05 | global batch size: 16 | lm loss: 6.985713E+00 | loss scale: 32768.0 | grad norm: 180390.498 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2666/ 159576 | consumed samples: 42656 | elapsed time per iteration (ms): 13618.0 | learning rate: 1.182E-05 | global batch size: 16 | lm loss: 6.556298E+00 | loss scale: 32768.0 | grad norm: 144470.571 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2667/ 159576 | consumed samples: 42672 | elapsed time per iteration (ms): 14126.5 | learning rate: 1.182E-05 | global batch size: 16 | lm loss: 7.063251E+00 | loss scale: 32768.0 | grad norm: 146115.736 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2668/ 159576 | consumed samples: 42688 | elapsed time per iteration (ms): 13677.8 | learning rate: 1.183E-05 | global batch size: 16 | lm loss: 6.846446E+00 | loss scale: 32768.0 | grad norm: 164938.381 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2669/ 159576 | consumed samples: 42704 | elapsed time per iteration (ms): 13662.5 | learning rate: 1.183E-05 | global batch size: 16 | lm loss: 6.704443E+00 | loss scale: 32768.0 | grad norm: 
183338.838 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2670/ 159576 | consumed samples: 42720 | elapsed time per iteration (ms): 13752.8 | learning rate: 1.184E-05 | global batch size: 16 | lm loss: 6.828314E+00 | loss scale: 32768.0 | grad norm: 291659.916 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2671/ 159576 | consumed samples: 42736 | elapsed time per iteration (ms): 14053.5 | learning rate: 1.184E-05 | global batch size: 16 | lm loss: 6.701608E+00 | loss scale: 32768.0 | grad norm: 137566.756 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2672/ 159576 | consumed samples: 42752 | elapsed time per iteration (ms): 13555.7 | learning rate: 1.184E-05 | global batch size: 16 | lm loss: 6.495778E+00 | loss scale: 32768.0 | grad norm: 140566.748 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2673/ 159576 | consumed samples: 42768 | elapsed time per iteration (ms): 13625.0 | learning rate: 1.185E-05 | global batch size: 16 | lm loss: 6.868438E+00 | loss scale: 32768.0 | grad norm: 137822.671 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2674/ 159576 | consumed samples: 42784 | elapsed time per iteration (ms): 13681.3 | learning rate: 1.185E-05 | global batch size: 16 | lm loss: 6.855990E+00 | loss scale: 32768.0 | grad norm: 217925.291 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2675/ 159576 | consumed samples: 42800 | elapsed time per iteration (ms): 13726.3 | learning rate: 1.186E-05 | global batch size: 16 | lm loss: 6.726338E+00 | loss scale: 32768.0 | grad norm: 169676.723 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2676/ 159576 | consumed samples: 42816 | elapsed time per iteration (ms): 14028.2 | learning rate: 1.186E-05 | global batch size: 16 | lm loss: 6.632861E+00 | loss scale: 32768.0 | grad norm: 146027.824 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2677/ 159576 | consumed samples: 42832 | elapsed time per iteration (ms): 13624.3 | learning rate: 1.187E-05 | global batch size: 16 | lm loss: 6.642831E+00 | loss scale: 32768.0 | grad norm: 163148.856 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2678/ 159576 | consumed samples: 42848 | elapsed time per iteration (ms): 13717.5 | learning rate: 1.187E-05 | global batch size: 16 | lm loss: 6.689285E+00 | loss scale: 32768.0 | grad norm: 129142.991 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2679/ 159576 | consumed samples: 42864 | elapsed time per iteration (ms): 13575.7 | learning rate: 1.188E-05 | global batch size: 16 | lm loss: 6.577474E+00 | loss scale: 32768.0 | grad norm: 168075.285 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2680/ 159576 | consumed samples: 42880 | elapsed time per iteration (ms): 13990.7 | learning rate: 1.188E-05 | global batch size: 16 | lm loss: 6.806996E+00 | loss scale: 32768.0 | grad norm: 138707.563 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2681/ 159576 | consumed samples: 42896 
| elapsed time per iteration (ms): 13614.3 | learning rate: 1.188E-05 | global batch size: 16 | lm loss: 6.616170E+00 | loss scale: 32768.0 | grad norm: 138396.885 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2682/ 159576 | consumed samples: 42912 | elapsed time per iteration (ms): 13528.4 | learning rate: 1.189E-05 | global batch size: 16 | lm loss: 6.760321E+00 | loss scale: 32768.0 | grad norm: 146622.283 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2683/ 159576 | consumed samples: 42928 | elapsed time per iteration (ms): 13595.4 | learning rate: 1.189E-05 | global batch size: 16 | lm loss: 6.828167E+00 | loss scale: 32768.0 | grad norm: 205452.941 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2684/ 159576 | consumed samples: 42944 | elapsed time per iteration (ms): 14090.0 | learning rate: 1.190E-05 | global batch size: 16 | lm loss: 6.974781E+00 | loss scale: 32768.0 | grad norm: 141438.762 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2685/ 159576 | consumed samples: 42960 | elapsed time per iteration (ms): 13490.5 | learning rate: 1.190E-05 | global batch size: 16 | lm loss: 6.720265E+00 | loss scale: 32768.0 | grad norm: 131667.640 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2686/ 159576 | consumed samples: 42976 | elapsed time per iteration (ms): 13606.4 | learning rate: 1.191E-05 | global batch size: 16 | lm loss: 6.645846E+00 | loss scale: 32768.0 | grad norm: 143915.440 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2687/ 159576 | consumed samples: 42992 | elapsed time per iteration (ms): 13579.9 | learning rate: 1.191E-05 | global batch size: 16 | lm loss: 6.852206E+00 | loss scale: 32768.0 | grad norm: 206032.603 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2688/ 159576 | consumed samples: 43008 | elapsed time per iteration (ms): 13654.7 | learning rate: 1.192E-05 | global batch size: 16 | lm loss: 6.708066E+00 | loss scale: 32768.0 | grad norm: 135547.494 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2689/ 159576 | consumed samples: 43024 | elapsed time per iteration (ms): 13756.9 | learning rate: 1.192E-05 | global batch size: 16 | lm loss: 6.627333E+00 | loss scale: 32768.0 | grad norm: 103806.748 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2690/ 159576 | consumed samples: 43040 | elapsed time per iteration (ms): 13560.8 | learning rate: 1.192E-05 | global batch size: 16 | lm loss: 6.624159E+00 | loss scale: 32768.0 | grad norm: 204724.023 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2691/ 159576 | consumed samples: 43056 | elapsed time per iteration (ms): 13656.6 | learning rate: 1.193E-05 | global batch size: 16 | lm loss: 6.803893E+00 | loss scale: 32768.0 | grad norm: 123248.563 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2692/ 159576 | consumed samples: 43072 | elapsed time per iteration (ms): 13672.9 | learning rate: 1.193E-05 | global batch size: 16 | lm loss: 6.801785E+00 | loss scale: 32768.0 | grad norm: 
140785.815 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2693/ 159576 | consumed samples: 43088 | elapsed time per iteration (ms): 14015.4 | learning rate: 1.194E-05 | global batch size: 16 | lm loss: 6.464381E+00 | loss scale: 32768.0 | grad norm: 131615.707 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2694/ 159576 | consumed samples: 43104 | elapsed time per iteration (ms): 13588.1 | learning rate: 1.194E-05 | global batch size: 16 | lm loss: 6.727094E+00 | loss scale: 32768.0 | grad norm: 213544.967 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2695/ 159576 | consumed samples: 43120 | elapsed time per iteration (ms): 13608.1 | learning rate: 1.195E-05 | global batch size: 16 | lm loss: 6.930735E+00 | loss scale: 32768.0 | grad norm: 179180.455 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2696/ 159576 | consumed samples: 43136 | elapsed time per iteration (ms): 13594.8 | learning rate: 1.195E-05 | global batch size: 16 | lm loss: 6.652137E+00 | loss scale: 32768.0 | grad norm: 171091.491 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2697/ 159576 | consumed samples: 43152 | elapsed time per iteration (ms): 13943.3 | learning rate: 1.196E-05 | global batch size: 16 | lm loss: 6.731685E+00 | loss scale: 32768.0 | grad norm: 151811.525 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2698/ 159576 | consumed samples: 43168 | elapsed time per iteration (ms): 13773.1 | learning rate: 1.196E-05 | global batch size: 16 | lm loss: 7.081783E+00 | loss scale: 32768.0 | grad norm: 132367.994 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2699/ 159576 | consumed samples: 43184 | elapsed time per iteration (ms): 13644.6 | learning rate: 1.196E-05 | global batch size: 16 | lm loss: 6.806893E+00 | loss scale: 32768.0 | grad norm: 319459.435 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2700/ 159576 | consumed samples: 43200 | elapsed time per iteration (ms): 13698.5 | learning rate: 1.197E-05 | global batch size: 16 | lm loss: 6.666497E+00 | loss scale: 32768.0 | grad norm: 120927.371 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2701/ 159576 | consumed samples: 43216 | elapsed time per iteration (ms): 13684.8 | learning rate: 1.197E-05 | global batch size: 16 | lm loss: 6.701412E+00 | loss scale: 32768.0 | grad norm: 150633.210 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2702/ 159576 | consumed samples: 43232 | elapsed time per iteration (ms): 13780.3 | learning rate: 1.198E-05 | global batch size: 16 | lm loss: 6.594296E+00 | loss scale: 32768.0 | grad norm: 161110.656 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2703/ 159576 | consumed samples: 43248 | elapsed time per iteration (ms): 13593.9 | learning rate: 1.198E-05 | global batch size: 16 | lm loss: 6.808178E+00 | loss scale: 32768.0 | grad norm: 258358.199 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2704/ 159576 | consumed samples: 43264 
| elapsed time per iteration (ms): 13635.4 | learning rate: 1.199E-05 | global batch size: 16 | lm loss: 6.815506E+00 | loss scale: 32768.0 | grad norm: 183028.281 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2705/ 159576 | consumed samples: 43280 | elapsed time per iteration (ms): 13605.1 | learning rate: 1.199E-05 | global batch size: 16 | lm loss: 6.967249E+00 | loss scale: 32768.0 | grad norm: 243583.534 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2706/ 159576 | consumed samples: 43296 | elapsed time per iteration (ms): 14130.1 | learning rate: 1.200E-05 | global batch size: 16 | lm loss: 7.062543E+00 | loss scale: 32768.0 | grad norm: 207737.438 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2707/ 159576 | consumed samples: 43312 | elapsed time per iteration (ms): 13561.8 | learning rate: 1.200E-05 | global batch size: 16 | lm loss: 6.758321E+00 | loss scale: 32768.0 | grad norm: 146527.588 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2708/ 159576 | consumed samples: 43328 | elapsed time per iteration (ms): 13722.0 | learning rate: 1.200E-05 | global batch size: 16 | lm loss: 6.584868E+00 | loss scale: 32768.0 | grad norm: 272015.780 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2709/ 159576 | consumed samples: 43344 | elapsed time per iteration (ms): 13654.1 | learning rate: 1.201E-05 | global batch size: 16 | lm loss: 6.709559E+00 | loss scale: 32768.0 | grad norm: 284012.046 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2710/ 159576 | consumed samples: 43360 | elapsed time per iteration (ms): 13595.7 | learning rate: 1.201E-05 | global batch size: 16 | lm loss: 6.830414E+00 | loss scale: 32768.0 | grad norm: 149403.503 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2711/ 159576 | consumed samples: 43376 | elapsed time per iteration (ms): 13973.4 | learning rate: 1.202E-05 | global batch size: 16 | lm loss: 6.624958E+00 | loss scale: 32768.0 | grad norm: 146777.014 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2712/ 159576 | consumed samples: 43392 | elapsed time per iteration (ms): 13700.0 | learning rate: 1.202E-05 | global batch size: 16 | lm loss: 6.735670E+00 | loss scale: 32768.0 | grad norm: 136631.989 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2713/ 159576 | consumed samples: 43408 | elapsed time per iteration (ms): 13572.3 | learning rate: 1.203E-05 | global batch size: 16 | lm loss: 6.765169E+00 | loss scale: 32768.0 | grad norm: 280479.328 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2714/ 159576 | consumed samples: 43424 | elapsed time per iteration (ms): 13642.4 | learning rate: 1.203E-05 | global batch size: 16 | lm loss: 6.622662E+00 | loss scale: 32768.0 | grad norm: 160875.579 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2715/ 159576 | consumed samples: 43440 | elapsed time per iteration (ms): 14122.3 | learning rate: 1.204E-05 | global batch size: 16 | lm loss: 6.730956E+00 | loss scale: 32768.0 | grad norm: 
206409.146 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2716/ 159576 | consumed samples: 43456 | elapsed time per iteration (ms): 13831.1 | learning rate: 1.204E-05 | global batch size: 16 | lm loss: 6.767645E+00 | loss scale: 32768.0 | grad norm: 149352.449 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2717/ 159576 | consumed samples: 43472 | elapsed time per iteration (ms): 13572.9 | learning rate: 1.204E-05 | global batch size: 16 | lm loss: 6.975914E+00 | loss scale: 32768.0 | grad norm: 119850.584 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2718/ 159576 | consumed samples: 43488 | elapsed time per iteration (ms): 13686.9 | learning rate: 1.205E-05 | global batch size: 16 | lm loss: 6.919794E+00 | loss scale: 32768.0 | grad norm: 172348.990 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2719/ 159576 | consumed samples: 43504 | elapsed time per iteration (ms): 13976.8 | learning rate: 1.205E-05 | global batch size: 16 | lm loss: 6.652202E+00 | loss scale: 32768.0 | grad norm: 178184.791 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2720/ 159576 | consumed samples: 43520 | elapsed time per iteration (ms): 13571.8 | learning rate: 1.206E-05 | global batch size: 16 | lm loss: 6.787558E+00 | loss scale: 32768.0 | grad norm: 130225.615 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2721/ 159576 | consumed samples: 43536 | elapsed time per iteration (ms): 13693.7 | learning rate: 1.206E-05 | global batch size: 16 | lm loss: 6.660249E+00 | loss scale: 32768.0 | grad norm: 144428.996 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2722/ 159576 | consumed samples: 43552 | elapsed time per iteration (ms): 13646.9 | learning rate: 1.207E-05 | global batch size: 16 | lm loss: 6.661267E+00 | loss scale: 32768.0 | grad norm: 121995.599 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2723/ 159576 | consumed samples: 43568 | elapsed time per iteration (ms): 13718.1 | learning rate: 1.207E-05 | global batch size: 16 | lm loss: 6.702977E+00 | loss scale: 32768.0 | grad norm: 205375.821 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2724/ 159576 | consumed samples: 43584 | elapsed time per iteration (ms): 14072.2 | learning rate: 1.208E-05 | global batch size: 16 | lm loss: 6.859900E+00 | loss scale: 32768.0 | grad norm: 174185.553 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2725/ 159576 | consumed samples: 43600 | elapsed time per iteration (ms): 13643.1 | learning rate: 1.208E-05 | global batch size: 16 | lm loss: 6.642687E+00 | loss scale: 32768.0 | grad norm: 124356.151 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2726/ 159576 | consumed samples: 43616 | elapsed time per iteration (ms): 13637.6 | learning rate: 1.208E-05 | global batch size: 16 | lm loss: 6.849540E+00 | loss scale: 32768.0 | grad norm: 187912.708 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2727/ 159576 | consumed samples: 43632 
| elapsed time per iteration (ms): 13570.5 | learning rate: 1.209E-05 | global batch size: 16 | lm loss: 6.505477E+00 | loss scale: 32768.0 | grad norm: 146429.461 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2728/ 159576 | consumed samples: 43648 | elapsed time per iteration (ms): 14179.1 | learning rate: 1.209E-05 | global batch size: 16 | lm loss: 6.763928E+00 | loss scale: 32768.0 | grad norm: 143016.379 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2729/ 159576 | consumed samples: 43664 | elapsed time per iteration (ms): 13666.5 | learning rate: 1.210E-05 | global batch size: 16 | lm loss: 6.746594E+00 | loss scale: 32768.0 | grad norm: 184649.070 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2730/ 159576 | consumed samples: 43680 | elapsed time per iteration (ms): 13666.9 | learning rate: 1.210E-05 | global batch size: 16 | lm loss: 6.822509E+00 | loss scale: 32768.0 | grad norm: 258599.749 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2731/ 159576 | consumed samples: 43696 | elapsed time per iteration (ms): 13722.5 | learning rate: 1.211E-05 | global batch size: 16 | lm loss: 6.726813E+00 | loss scale: 32768.0 | grad norm: 135253.086 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2732/ 159576 | consumed samples: 43712 | elapsed time per iteration (ms): 14110.6 | learning rate: 1.211E-05 | global batch size: 16 | lm loss: 6.642574E+00 | loss scale: 32768.0 | grad norm: 187051.418 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2733/ 159576 | consumed samples: 43728 | elapsed time per iteration (ms): 13665.7 | learning rate: 1.212E-05 | global batch size: 16 | lm loss: 6.608624E+00 | loss scale: 32768.0 | grad norm: 164163.009 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2734/ 159576 | consumed samples: 43744 | elapsed time per iteration (ms): 13624.6 | learning rate: 1.212E-05 | global batch size: 16 | lm loss: 6.755674E+00 | loss scale: 32768.0 | grad norm: 129230.586 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2735/ 159576 | consumed samples: 43760 | elapsed time per iteration (ms): 13617.1 | learning rate: 1.212E-05 | global batch size: 16 | lm loss: 6.771841E+00 | loss scale: 32768.0 | grad norm: 254766.602 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2736/ 159576 | consumed samples: 43776 | elapsed time per iteration (ms): 13675.3 | learning rate: 1.213E-05 | global batch size: 16 | lm loss: 6.677852E+00 | loss scale: 32768.0 | grad norm: 142644.144 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2737/ 159576 | consumed samples: 43792 | elapsed time per iteration (ms): 13983.3 | learning rate: 1.213E-05 | global batch size: 16 | lm loss: 6.719501E+00 | loss scale: 32768.0 | grad norm: 164953.828 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2738/ 159576 | consumed samples: 43808 | elapsed time per iteration (ms): 13774.1 | learning rate: 1.214E-05 | global batch size: 16 | lm loss: 6.637510E+00 | loss scale: 32768.0 | grad norm: 
161949.402 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2739/ 159576 | consumed samples: 43824 | elapsed time per iteration (ms): 13780.8 | learning rate: 1.214E-05 | global batch size: 16 | lm loss: 6.670253E+00 | loss scale: 32768.0 | grad norm: 132053.899 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2740/ 159576 | consumed samples: 43840 | elapsed time per iteration (ms): 13656.5 | learning rate: 1.215E-05 | global batch size: 16 | lm loss: 6.701370E+00 | loss scale: 32768.0 | grad norm: 158609.635 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2741/ 159576 | consumed samples: 43856 | elapsed time per iteration (ms): 13970.4 | learning rate: 1.215E-05 | global batch size: 16 | lm loss: 6.676120E+00 | loss scale: 32768.0 | grad norm: 133079.118 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2742/ 159576 | consumed samples: 43872 | elapsed time per iteration (ms): 13572.9 | learning rate: 1.216E-05 | global batch size: 16 | lm loss: 6.666083E+00 | loss scale: 32768.0 | grad norm: 121076.330 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2743/ 159576 | consumed samples: 43888 | elapsed time per iteration (ms): 13635.9 | learning rate: 1.216E-05 | global batch size: 16 | lm loss: 6.594894E+00 | loss scale: 32768.0 | grad norm: 206897.979 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2744/ 159576 | consumed samples: 43904 | elapsed time per iteration (ms): 13681.8 | learning rate: 1.216E-05 | global batch size: 16 | lm loss: 6.700480E+00 | loss scale: 32768.0 | grad norm: 126037.905 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2745/ 159576 | consumed samples: 43920 | elapsed time per iteration (ms): 13966.9 | learning rate: 1.217E-05 | global batch size: 16 | lm loss: 6.708483E+00 | loss scale: 32768.0 | grad norm: 136172.741 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2746/ 159576 | consumed samples: 43936 | elapsed time per iteration (ms): 13758.4 | learning rate: 1.217E-05 | global batch size: 16 | lm loss: 6.629419E+00 | loss scale: 32768.0 | grad norm: 142570.267 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2747/ 159576 | consumed samples: 43952 | elapsed time per iteration (ms): 13668.5 | learning rate: 1.218E-05 | global batch size: 16 | lm loss: 6.597517E+00 | loss scale: 32768.0 | grad norm: 155237.223 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2748/ 159576 | consumed samples: 43968 | elapsed time per iteration (ms): 13633.2 | learning rate: 1.218E-05 | global batch size: 16 | lm loss: 6.561327E+00 | loss scale: 32768.0 | grad norm: 162642.892 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2749/ 159576 | consumed samples: 43984 | elapsed time per iteration (ms): 13608.4 | learning rate: 1.219E-05 | global batch size: 16 | lm loss: 6.677460E+00 | loss scale: 32768.0 | grad norm: 192650.212 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2750/ 159576 | consumed samples: 44000 
| elapsed time per iteration (ms): 13886.7 | learning rate: 1.219E-05 | global batch size: 16 | lm loss: 6.649335E+00 | loss scale: 32768.0 | grad norm: 171673.975 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2751/ 159576 | consumed samples: 44016 | elapsed time per iteration (ms): 13671.6 | learning rate: 1.220E-05 | global batch size: 16 | lm loss: 6.735415E+00 | loss scale: 32768.0 | grad norm: 128822.354 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2752/ 159576 | consumed samples: 44032 | elapsed time per iteration (ms): 13708.1 | learning rate: 1.220E-05 | global batch size: 16 | lm loss: 6.679979E+00 | loss scale: 32768.0 | grad norm: 253310.737 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2753/ 159576 | consumed samples: 44048 | elapsed time per iteration (ms): 13770.7 | learning rate: 1.220E-05 | global batch size: 16 | lm loss: 6.565764E+00 | loss scale: 32768.0 | grad norm: 116179.545 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2754/ 159576 | consumed samples: 44064 | elapsed time per iteration (ms): 14066.6 | learning rate: 1.221E-05 | global batch size: 16 | lm loss: 6.742185E+00 | loss scale: 32768.0 | grad norm: 141403.598 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2755/ 159576 | consumed samples: 44080 | elapsed time per iteration (ms): 13651.8 | learning rate: 1.221E-05 | global batch size: 16 | lm loss: 6.762599E+00 | loss scale: 32768.0 | grad norm: 111172.995 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2756/ 159576 | consumed samples: 44096 | elapsed time per iteration (ms): 13694.5 | learning rate: 1.222E-05 | global batch size: 16 | lm loss: 6.733878E+00 | loss scale: 32768.0 | grad norm: 128168.972 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2757/ 159576 | consumed samples: 44112 | elapsed time per iteration (ms): 13604.8 | learning rate: 1.222E-05 | global batch size: 16 | lm loss: 6.588708E+00 | loss scale: 32768.0 | grad norm: 103022.500 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2758/ 159576 | consumed samples: 44128 | elapsed time per iteration (ms): 13653.9 | learning rate: 1.223E-05 | global batch size: 16 | lm loss: 6.562719E+00 | loss scale: 32768.0 | grad norm: 138192.892 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2759/ 159576 | consumed samples: 44144 | elapsed time per iteration (ms): 13986.1 | learning rate: 1.223E-05 | global batch size: 16 | lm loss: 6.738625E+00 | loss scale: 32768.0 | grad norm: 121839.165 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2760/ 159576 | consumed samples: 44160 | elapsed time per iteration (ms): 13725.3 | learning rate: 1.224E-05 | global batch size: 16 | lm loss: 6.566117E+00 | loss scale: 32768.0 | grad norm: 104901.052 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2761/ 159576 | consumed samples: 44176 | elapsed time per iteration (ms): 13770.1 | learning rate: 1.224E-05 | global batch size: 16 | lm loss: 6.666871E+00 | loss scale: 32768.0 | grad norm: 
123398.519 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2762/ 159576 | consumed samples: 44192 | elapsed time per iteration (ms): 13627.5 | learning rate: 1.224E-05 | global batch size: 16 | lm loss: 6.835371E+00 | loss scale: 32768.0 | grad norm: 112214.547 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2763/ 159576 | consumed samples: 44208 | elapsed time per iteration (ms): 14068.3 | learning rate: 1.225E-05 | global batch size: 16 | lm loss: 6.804303E+00 | loss scale: 32768.0 | grad norm: 122506.789 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2764/ 159576 | consumed samples: 44224 | elapsed time per iteration (ms): 6917.6 | learning rate: 1.225E-05 | global batch size: 16 | lm loss: 6.972560E+00 | loss scale: 16384.0 | grad norm: 122506.789 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2765/ 159576 | consumed samples: 44240 | elapsed time per iteration (ms): 13181.9 | learning rate: 1.225E-05 | global batch size: 16 | lm loss: 6.580292E+00 | loss scale: 16384.0 | grad norm: 59992.079 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2766/ 159576 | consumed samples: 44256 | elapsed time per iteration (ms): 13680.1 | learning rate: 1.226E-05 | global batch size: 16 | lm loss: 6.724333E+00 | loss scale: 16384.0 | grad norm: 77015.113 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2767/ 159576 | consumed samples: 44272 | elapsed time per iteration (ms): 13716.6 | learning rate: 1.226E-05 | global batch size: 16 | lm loss: 6.933354E+00 | loss scale: 16384.0 | grad norm: 85522.390 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2768/ 159576 | consumed samples: 44288 | elapsed time per iteration (ms): 13994.0 | learning rate: 1.227E-05 | global batch size: 16 | lm loss: 6.648163E+00 | loss scale: 16384.0 | grad norm: 58295.975 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2769/ 159576 | consumed samples: 44304 | elapsed time per iteration (ms): 13658.9 | learning rate: 1.227E-05 | global batch size: 16 | lm loss: 6.891530E+00 | loss scale: 16384.0 | grad norm: 75446.588 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2770/ 159576 | consumed samples: 44320 | elapsed time per iteration (ms): 13703.7 | learning rate: 1.228E-05 | global batch size: 16 | lm loss: 6.591332E+00 | loss scale: 16384.0 | grad norm: 59290.056 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2771/ 159576 | consumed samples: 44336 | elapsed time per iteration (ms): 13716.9 | learning rate: 1.228E-05 | global batch size: 16 | lm loss: 6.737020E+00 | loss scale: 16384.0 | grad norm: 51929.323 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2772/ 159576 | consumed samples: 44352 | elapsed time per iteration (ms): 14010.7 | learning rate: 1.228E-05 | global batch size: 16 | lm loss: 6.565439E+00 | loss scale: 16384.0 | grad norm: 100304.309 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2773/ 159576 | consumed samples: 44368 | 
elapsed time per iteration (ms): 13566.2 | learning rate: 1.229E-05 | global batch size: 16 | lm loss: 6.887408E+00 | loss scale: 16384.0 | grad norm: 86699.024 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2774/ 159576 | consumed samples: 44384 | elapsed time per iteration (ms): 13639.1 | learning rate: 1.229E-05 | global batch size: 16 | lm loss: 6.766156E+00 | loss scale: 16384.0 | grad norm: 64840.948 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2775/ 159576 | consumed samples: 44400 | elapsed time per iteration (ms): 13646.1 | learning rate: 1.230E-05 | global batch size: 16 | lm loss: 6.640082E+00 | loss scale: 16384.0 | grad norm: 61943.696 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2776/ 159576 | consumed samples: 44416 | elapsed time per iteration (ms): 13670.4 | learning rate: 1.230E-05 | global batch size: 16 | lm loss: 6.784959E+00 | loss scale: 16384.0 | grad norm: 68978.844 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2777/ 159576 | consumed samples: 44432 | elapsed time per iteration (ms): 14012.8 | learning rate: 1.231E-05 | global batch size: 16 | lm loss: 6.670368E+00 | loss scale: 16384.0 | grad norm: 58668.320 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2778/ 159576 | consumed samples: 44448 | elapsed time per iteration (ms): 13651.5 | learning rate: 1.231E-05 | global batch size: 16 | lm loss: 6.849538E+00 | loss scale: 16384.0 | grad norm: 53539.454 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2779/ 159576 | consumed samples: 44464 | elapsed time per iteration (ms): 13531.1 | learning rate: 1.232E-05 | global batch size: 16 | lm loss: 6.710807E+00 | loss scale: 16384.0 | grad norm: 58047.417 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2780/ 159576 | consumed samples: 44480 | elapsed time per iteration (ms): 13601.2 | learning rate: 1.232E-05 | global batch size: 16 | lm loss: 6.803576E+00 | loss scale: 16384.0 | grad norm: 61014.969 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2781/ 159576 | consumed samples: 44496 | elapsed time per iteration (ms): 14011.6 | learning rate: 1.232E-05 | global batch size: 16 | lm loss: 6.435648E+00 | loss scale: 16384.0 | grad norm: 72928.257 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2782/ 159576 | consumed samples: 44512 | elapsed time per iteration (ms): 13706.9 | learning rate: 1.233E-05 | global batch size: 16 | lm loss: 6.689322E+00 | loss scale: 16384.0 | grad norm: 45124.285 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2783/ 159576 | consumed samples: 44528 | elapsed time per iteration (ms): 13638.0 | learning rate: 1.233E-05 | global batch size: 16 | lm loss: 6.796506E+00 | loss scale: 16384.0 | grad norm: 61254.307 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 2784/ 159576 | consumed samples: 44544 | elapsed time per iteration (ms): 13617.3 | learning rate: 1.234E-05 | global batch size: 16 | lm loss: 6.726316E+00 | loss scale: 16384.0 | grad norm: 58102.179 | num 
zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2785/ 159576 | consumed samples: 44560 | elapsed time per iteration (ms): 13946.8 | learning rate: 1.234E-05 | global batch size: 16 | lm loss: 6.648038E+00 | loss scale: 16384.0 | grad norm: 68282.211 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2786/ 159576 | consumed samples: 44576 | elapsed time per iteration (ms): 13594.9 | learning rate: 1.235E-05 | global batch size: 16 | lm loss: 6.860110E+00 | loss scale: 16384.0 | grad norm: 70475.870 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2787/ 159576 | consumed samples: 44592 | elapsed time per iteration (ms): 13607.8 | learning rate: 1.235E-05 | global batch size: 16 | lm loss: 6.821939E+00 | loss scale: 16384.0 | grad norm: 56499.351 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2788/ 159576 | consumed samples: 44608 | elapsed time per iteration (ms): 13592.1 | learning rate: 1.236E-05 | global batch size: 16 | lm loss: 6.702363E+00 | loss scale: 16384.0 | grad norm: 71878.494 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2789/ 159576 | consumed samples: 44624 | elapsed time per iteration (ms): 13633.0 | learning rate: 1.236E-05 | global batch size: 16 | lm loss: 6.596258E+00 | loss scale: 16384.0 | grad norm: 57167.131 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2790/ 159576 | consumed samples: 44640 | elapsed time per iteration (ms): 13806.2 | learning rate: 1.236E-05 | global batch size: 16 | lm loss: 6.742100E+00 | loss scale: 16384.0 | grad norm: 78591.535 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2791/ 159576 | consumed samples: 44656 | elapsed time per iteration (ms): 13659.4 | learning rate: 1.237E-05 | global batch size: 16 | lm loss: 6.602869E+00 | loss scale: 16384.0 | grad norm: 68726.337 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2792/ 159576 | consumed samples: 44672 | elapsed time per iteration (ms): 13592.2 | learning rate: 1.237E-05 | global batch size: 16 | lm loss: 6.708993E+00 | loss scale: 16384.0 | grad norm: 98214.491 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2793/ 159576 | consumed samples: 44688 | elapsed time per iteration (ms): 13507.3 | learning rate: 1.238E-05 | global batch size: 16 | lm loss: 6.616965E+00 | loss scale: 16384.0 | grad norm: 72150.719 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2794/ 159576 | consumed samples: 44704 | elapsed time per iteration (ms): 13955.1 | learning rate: 1.238E-05 | global batch size: 16 | lm loss: 6.607640E+00 | loss scale: 16384.0 | grad norm: 62728.696 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2795/ 159576 | consumed samples: 44720 | elapsed time per iteration (ms): 13531.1 | learning rate: 1.239E-05 | global batch size: 16 | lm loss: 6.875388E+00 | loss scale: 16384.0 | grad norm: 94768.672 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2796/ 159576 | consumed samples: 44736 | elapsed time per iteration (ms): 13614.2 | learning rate: 1.239E-05 | global batch size: 16 | lm loss: 6.827682E+00 | loss scale: 16384.0 | grad norm: 59818.476 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2797/ 159576 | consumed samples: 44752 | elapsed time per iteration (ms): 13620.6 | learning rate: 1.239E-05 | global batch size: 16 | lm loss: 6.522869E+00 | loss scale: 16384.0 | grad norm: 74009.172 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2798/ 159576 | consumed samples: 44768 | elapsed time per iteration (ms): 13985.4 | learning rate: 1.240E-05 | global batch size: 16 | lm loss: 6.654684E+00 | loss scale: 16384.0 | grad norm: 54913.035 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2799/ 159576 | consumed samples: 44784 | elapsed time per iteration (ms): 13759.4 | learning rate: 1.240E-05 | global batch size: 16 | lm loss: 6.544140E+00 | loss scale: 16384.0 | grad norm: 83654.114 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2800/ 159576 | consumed samples: 44800 | elapsed time per iteration (ms): 13524.0 | learning rate: 1.241E-05 | global batch size: 16 | lm loss: 6.798269E+00 | loss scale: 16384.0 | grad norm: 80678.341 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2801/ 159576 | consumed samples: 44816 | elapsed time per iteration (ms): 13646.5 | learning rate: 1.241E-05 | global batch size: 16 | lm loss: 6.872281E+00 | loss scale: 16384.0 | grad norm: 49084.933 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2802/ 159576 | consumed samples: 44832 | elapsed time per iteration (ms): 13614.0 | learning rate: 1.242E-05 | global batch size: 16 | lm loss: 6.733764E+00 | loss scale: 16384.0 | grad norm: 88585.751 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2803/ 159576 | consumed samples: 44848 | elapsed time per iteration (ms): 13792.4 | learning rate: 1.242E-05 | global batch size: 16 | lm loss: 6.865559E+00 | loss scale: 16384.0 | grad norm: 48186.949 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2804/ 159576 | consumed samples: 44864 | elapsed time per iteration (ms): 13655.0 | learning rate: 1.243E-05 | global batch size: 16 | lm loss: 6.631515E+00 | loss scale: 16384.0 | grad norm: 66281.190 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2805/ 159576 | consumed samples: 44880 | elapsed time per iteration (ms): 13605.4 | learning rate: 1.243E-05 | global batch size: 16 | lm loss: 6.593436E+00 | loss scale: 16384.0 | grad norm: 66274.800 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2806/ 159576 | consumed samples: 44896 | elapsed time per iteration (ms): 13611.6 | learning rate: 1.243E-05 | global batch size: 16 | lm loss: 6.692297E+00 | loss scale: 16384.0 | grad norm: 66535.812 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2807/ 159576 | consumed samples: 44912 | elapsed time per iteration (ms): 13924.4 | learning rate: 1.244E-05 | global batch size: 16 | lm loss: 6.564488E+00 | loss scale: 16384.0 | grad norm: 62289.026 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2808/ 159576 | consumed samples: 44928 | elapsed time per iteration (ms): 13559.5 | learning rate: 1.244E-05 | global batch size: 16 | lm loss: 6.775381E+00 | loss scale: 16384.0 | grad norm: 51114.400 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2809/ 159576 | consumed samples: 44944 | elapsed time per iteration (ms): 13579.6 | learning rate: 1.245E-05 | global batch size: 16 | lm loss: 6.854599E+00 | loss scale: 16384.0 | grad norm: 78574.479 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2810/ 159576 | consumed samples: 44960 | elapsed time per iteration (ms): 13568.8 | learning rate: 1.245E-05 | global batch size: 16 | lm loss: 6.641658E+00 | loss scale: 16384.0 | grad norm: 48054.399 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2811/ 159576 | consumed samples: 44976 | elapsed time per iteration (ms): 13577.2 | learning rate: 1.246E-05 | global batch size: 16 | lm loss: 6.804714E+00 | loss scale: 16384.0 | grad norm: 85293.239 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2812/ 159576 | consumed samples: 44992 | elapsed time per iteration (ms): 13780.4 | learning rate: 1.246E-05 | global batch size: 16 | lm loss: 6.484572E+00 | loss scale: 16384.0 | grad norm: 54599.094 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2813/ 159576 | consumed samples: 45008 | elapsed time per iteration (ms): 13630.2 | learning rate: 1.247E-05 | global batch size: 16 | lm loss: 6.495656E+00 | loss scale: 16384.0 | grad norm: 131722.081 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2814/ 159576 | consumed samples: 45024 | elapsed time per iteration (ms): 13626.8 | learning rate: 1.247E-05 | global batch size: 16 | lm loss: 6.894939E+00 | loss scale: 16384.0 | grad norm: 102881.431 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2815/ 159576 | consumed samples: 45040 | elapsed time per iteration (ms): 13599.0 | learning rate: 1.247E-05 | global batch size: 16 | lm loss: 6.883965E+00 | loss scale: 16384.0 | grad norm: 72100.325 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2816/ 159576 | consumed samples: 45056 | elapsed time per iteration (ms): 14052.1 | learning rate: 1.248E-05 | global batch size: 16 | lm loss: 6.573022E+00 | loss scale: 16384.0 | grad norm: 72968.507 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2817/ 159576 | consumed samples: 45072 | elapsed time per iteration (ms): 13541.1 | learning rate: 1.248E-05 | global batch size: 16 | lm loss: 6.646833E+00 | loss scale: 16384.0 | grad norm: 90510.016 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2818/ 159576 | consumed samples: 45088 | elapsed time per iteration (ms): 13597.7 | learning rate: 1.249E-05 | global batch size: 16 | lm loss: 6.898618E+00 | loss scale: 16384.0 | grad norm: 90037.566 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2819/ 159576 | consumed samples: 45104 | elapsed time per iteration (ms): 13575.0 | learning rate: 1.249E-05 | global batch size: 16 | lm loss: 6.547668E+00 | loss scale: 16384.0 | grad norm: 79277.803 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2820/ 159576 | consumed samples: 45120 | elapsed time per iteration (ms): 14016.3 | learning rate: 1.250E-05 | global batch size: 16 | lm loss: 6.791230E+00 | loss scale: 16384.0 | grad norm: 63437.139 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2821/ 159576 | consumed samples: 45136 | elapsed time per iteration (ms): 13565.5 | learning rate: 1.250E-05 | global batch size: 16 | lm loss: 6.957808E+00 | loss scale: 16384.0 | grad norm: 56738.743 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2822/ 159576 | consumed samples: 45152 | elapsed time per iteration (ms): 13564.0 | learning rate: 1.251E-05 | global batch size: 16 | lm loss: 6.729958E+00 | loss scale: 16384.0 | grad norm: 93778.013 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2823/ 159576 | consumed samples: 45168 | elapsed time per iteration (ms): 13650.0 | learning rate: 1.251E-05 | global batch size: 16 | lm loss: 6.480144E+00 | loss scale: 16384.0 | grad norm: 60246.483 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2824/ 159576 | consumed samples: 45184 | elapsed time per iteration (ms): 13511.5 | learning rate: 1.251E-05 | global batch size: 16 | lm loss: 6.595847E+00 | loss scale: 16384.0 | grad norm: 63557.308 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2825/ 159576 | consumed samples: 45200 | elapsed time per iteration (ms): 13655.5 | learning rate: 1.252E-05 | global batch size: 16 | lm loss: 6.689149E+00 | loss scale: 16384.0 | grad norm: 67372.582 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2826/ 159576 | consumed samples: 45216 | elapsed time per iteration (ms): 13638.0 | learning rate: 1.252E-05 | global batch size: 16 | lm loss: 6.689507E+00 | loss scale: 16384.0 | grad norm: 69124.069 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2827/ 159576 | consumed samples: 45232 | elapsed time per iteration (ms): 13546.1 | learning rate: 1.253E-05 | global batch size: 16 | lm loss: 6.457958E+00 | loss scale: 16384.0 | grad norm: 56160.018 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2828/ 159576 | consumed samples: 45248 | elapsed time per iteration (ms): 13610.9 | learning rate: 1.253E-05 | global batch size: 16 | lm loss: 6.815155E+00 | loss scale: 16384.0 | grad norm: 61009.082 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2829/ 159576 | consumed samples: 45264 | elapsed time per iteration (ms): 13930.1 | learning rate: 1.254E-05 | global batch size: 16 | lm loss: 6.595886E+00 | loss scale: 16384.0 | grad norm: 45906.579 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
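[Note, not part of the original log: every iteration above follows the same fixed field layout, so the scalars can be pulled out mechanically for plotting or monitoring. A minimal parsing sketch; the regex and helper name are my own, not from the training repo.]

```python
import re

# Matches iteration lines of the form shown above, e.g.
#  iteration 2785/ 159576 | consumed samples: 44560 | ... | grad norm: 68282.211 | ...
ITER_RE = re.compile(
    r"iteration\s+(\d+)/\s*\d+ \| consumed samples: (\d+) \| "
    r"elapsed time per iteration \(ms\): ([\d.]+) \| learning rate: ([\dE.+-]+) \| "
    r"global batch size: (\d+) \| lm loss: ([\dE.+-]+) \| loss scale: ([\d.]+) \| "
    r"grad norm: ([\d.]+)"
)

def parse_log(path):
    """Yield one dict per training iteration found in the log."""
    with open(path) as f:
        for line in f:
            m = ITER_RE.search(line)
            if m:
                it, samples, ms, lr, gbs, loss, scale, gnorm = m.groups()
                yield {
                    "iteration": int(it),
                    "consumed_samples": int(samples),
                    "ms_per_iter": float(ms),
                    "lr": float(lr),
                    "global_batch_size": int(gbs),
                    "lm_loss": float(loss),
                    "loss_scale": float(scale),
                    "grad_norm": float(gnorm),
                }
```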
- iteration 2830/ 159576 | consumed samples: 45280 | elapsed time per iteration (ms): 13608.1 | learning rate: 1.254E-05 | global batch size: 16 | lm loss: 6.642846E+00 | loss scale: 16384.0 | grad norm: 74796.317 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2831/ 159576 | consumed samples: 45296 | elapsed time per iteration (ms): 13539.5 | learning rate: 1.255E-05 | global batch size: 16 | lm loss: 6.810493E+00 | loss scale: 16384.0 | grad norm: 64536.090 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2832/ 159576 | consumed samples: 45312 | elapsed time per iteration (ms): 13571.9 | learning rate: 1.255E-05 | global batch size: 16 | lm loss: 6.742997E+00 | loss scale: 16384.0 | grad norm: 54697.232 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2833/ 159576 | consumed samples: 45328 | elapsed time per iteration (ms): 13626.3 | learning rate: 1.255E-05 | global batch size: 16 | lm loss: 6.734198E+00 | loss scale: 16384.0 | grad norm: 124253.673 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2834/ 159576 | consumed samples: 45344 | elapsed time per iteration (ms): 13905.8 | learning rate: 1.256E-05 | global batch size: 16 | lm loss: 6.652834E+00 | loss scale: 16384.0 | grad norm: 71659.235 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-[2021-09-24 13:06:39] PULSE: tr8-104B is waiting for the previous job to finish before scheduling a new one using the dependency mechanism (1165978_[1-10%1] on 'gpu_p13' partition)
-[2021-09-24 13:06:39] PULSE: tr8-104B is running for 7:14:28 since 2021-09-24T05:52:11 (1162855_1 on 'gpu_p13' partition (r6i4n[5,7],r6i5n[2,7-8],r6i6n[0,2,6],r7i2n[4-5],r7i6n[2-4],r7i7n[7-8],r8i0n[2-3,5-8],r8i1n[0,2-4],r8i2n8,r8i3n[0-2],r8i5n[3-4],r8i7n[3-8],r9i0n[0-2],r9i1n[0-3],r9i2n[3-5,8],r9i3n[0-1,7-8],r9i4n[0-2],r9i5n[3-8],r9i6n[0,7-8])
- iteration 2835/ 159576 | consumed samples: 45360 | elapsed time per iteration (ms): 13609.5 | learning rate: 1.256E-05 | global batch size: 16 | lm loss: 6.789959E+00 | loss scale: 16384.0 | grad norm: 73488.360 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2836/ 159576 | consumed samples: 45376 | elapsed time per iteration (ms): 13614.7 | learning rate: 1.257E-05 | global batch size: 16 | lm loss: 6.695529E+00 | loss scale: 16384.0 | grad norm: 69307.839 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2837/ 159576 | consumed samples: 45392 | elapsed time per iteration (ms): 13634.1 | learning rate: 1.257E-05 | global batch size: 16 | lm loss: 6.550642E+00 | loss scale: 16384.0 | grad norm: 88157.717 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2838/ 159576 | consumed samples: 45408 | elapsed time per iteration (ms): 14029.3 | learning rate: 1.258E-05 | global batch size: 16 | lm loss: 6.745864E+00 | loss scale: 16384.0 | grad norm: 79032.308 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2839/ 159576 | consumed samples: 45424 | elapsed time per iteration (ms): 13631.7 | learning rate: 1.258E-05 | global batch size: 16 | lm loss: 7.013217E+00 | loss scale: 16384.0 | grad norm: 90598.683 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2840/ 159576 | consumed samples: 45440 | elapsed time per iteration (ms): 13552.2 | learning rate: 1.259E-05 | global batch size: 16 | lm loss: 6.791473E+00 | loss scale: 16384.0 | grad norm: 66761.526 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2841/ 159576 | consumed samples: 45456 | elapsed time per iteration (ms): 13585.4 | learning rate: 1.259E-05 | global batch size: 16 | lm loss: 6.639102E+00 | loss scale: 16384.0 | grad norm: 75945.826 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2842/ 159576 | consumed samples: 45472 | elapsed time per iteration (ms): 14005.5 | learning rate: 1.259E-05 | global batch size: 16 | lm loss: 6.750570E+00 | loss scale: 16384.0 | grad norm: 52422.045 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2843/ 159576 | consumed samples: 45488 | elapsed time per iteration (ms): 13637.6 | learning rate: 1.260E-05 | global batch size: 16 | lm loss: 6.761233E+00 | loss scale: 16384.0 | grad norm: 96201.801 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2844/ 159576 | consumed samples: 45504 | elapsed time per iteration (ms): 13605.0 | learning rate: 1.260E-05 | global batch size: 16 | lm loss: 6.869712E+00 | loss scale: 16384.0 | grad norm: 85259.060 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2845/ 159576 | consumed samples: 45520 | elapsed time per iteration (ms): 13489.6 | learning rate: 1.261E-05 | global batch size: 16 | lm loss: 6.754227E+00 | loss scale: 16384.0 | grad norm: 71430.457 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2846/ 159576 | consumed samples: 45536 | elapsed time per iteration (ms): 13633.0 | learning rate: 1.261E-05 | global batch size: 16 | lm loss: 6.681328E+00 | loss scale: 16384.0 | grad norm: 64498.648 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2847/ 159576 | consumed samples: 45552 | elapsed time per iteration (ms): 13680.5 | learning rate: 1.262E-05 | global batch size: 16 | lm loss: 6.708944E+00 | loss scale: 16384.0 | grad norm: 99300.512 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2848/ 159576 | consumed samples: 45568 | elapsed time per iteration (ms): 13578.9 | learning rate: 1.262E-05 | global batch size: 16 | lm loss: 6.689048E+00 | loss scale: 16384.0 | grad norm: 90482.932 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2849/ 159576 | consumed samples: 45584 | elapsed time per iteration (ms): 13613.6 | learning rate: 1.263E-05 | global batch size: 16 | lm loss: 6.673044E+00 | loss scale: 16384.0 | grad norm: 59461.342 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2850/ 159576 | consumed samples: 45600 | elapsed time per iteration (ms): 13675.0 | learning rate: 1.263E-05 | global batch size: 16 | lm loss: 6.738005E+00 | loss scale: 16384.0 | grad norm: 101125.933 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2851/ 159576 | consumed samples: 45616 | elapsed time per iteration (ms): 13897.5 | learning rate: 1.263E-05 | global batch size: 16 | lm loss: 6.522173E+00 | loss scale: 16384.0 | grad norm: 90321.174 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2852/ 159576 | consumed samples: 45632 | elapsed time per iteration (ms): 13599.3 | learning rate: 1.264E-05 | global batch size: 16 | lm loss: 6.524035E+00 | loss scale: 16384.0 | grad norm: 70117.709 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2853/ 159576 | consumed samples: 45648 | elapsed time per iteration (ms): 13643.7 | learning rate: 1.264E-05 | global batch size: 16 | lm loss: 6.510409E+00 | loss scale: 16384.0 | grad norm: 64993.085 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2854/ 159576 | consumed samples: 45664 | elapsed time per iteration (ms): 13552.1 | learning rate: 1.265E-05 | global batch size: 16 | lm loss: 6.913634E+00 | loss scale: 16384.0 | grad norm: 106101.567 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2855/ 159576 | consumed samples: 45680 | elapsed time per iteration (ms): 13759.3 | learning rate: 1.265E-05 | global batch size: 16 | lm loss: 6.640407E+00 | loss scale: 16384.0 | grad norm: 114581.301 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2856/ 159576 | consumed samples: 45696 | elapsed time per iteration (ms): 13808.3 | learning rate: 1.266E-05 | global batch size: 16 | lm loss: 6.781041E+00 | loss scale: 16384.0 | grad norm: 56604.166 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2857/ 159576 | consumed samples: 45712 | elapsed time per iteration (ms): 13620.2 | learning rate: 1.266E-05 | global batch size: 16 | lm loss: 6.794811E+00 | loss scale: 16384.0 | grad norm: 60150.039 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2858/ 159576 | consumed samples: 45728 | elapsed time per iteration (ms): 13675.9 | learning rate: 1.267E-05 | global batch size: 16 | lm loss: 6.586791E+00 | loss scale: 16384.0 | grad norm: 100786.813 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2859/ 159576 | consumed samples: 45744 | elapsed time per iteration (ms): 13583.4 | learning rate: 1.267E-05 | global batch size: 16 | lm loss: 6.762810E+00 | loss scale: 16384.0 | grad norm: 82968.484 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2860/ 159576 | consumed samples: 45760 | elapsed time per iteration (ms): 13906.7 | learning rate: 1.267E-05 | global batch size: 16 | lm loss: 6.739496E+00 | loss scale: 16384.0 | grad norm: 51306.674 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2861/ 159576 | consumed samples: 45776 | elapsed time per iteration (ms): 13619.1 | learning rate: 1.268E-05 | global batch size: 16 | lm loss: 6.046006E+00 | loss scale: 16384.0 | grad norm: 70726.137 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2862/ 159576 | consumed samples: 45792 | elapsed time per iteration (ms): 13544.2 | learning rate: 1.268E-05 | global batch size: 16 | lm loss: 6.803837E+00 | loss scale: 16384.0 | grad norm: 68740.252 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2863/ 159576 | consumed samples: 45808 | elapsed time per iteration (ms): 13610.8 | learning rate: 1.269E-05 | global batch size: 16 | lm loss: 6.770112E+00 | loss scale: 16384.0 | grad norm: 139814.235 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2864/ 159576 | consumed samples: 45824 | elapsed time per iteration (ms): 13958.0 | learning rate: 1.269E-05 | global batch size: 16 | lm loss: 6.750904E+00 | loss scale: 16384.0 | grad norm: 77621.986 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2865/ 159576 | consumed samples: 45840 | elapsed time per iteration (ms): 13670.7 | learning rate: 1.270E-05 | global batch size: 16 | lm loss: 6.696413E+00 | loss scale: 16384.0 | grad norm: 71170.214 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2866/ 159576 | consumed samples: 45856 | elapsed time per iteration (ms): 13638.6 | learning rate: 1.270E-05 | global batch size: 16 | lm loss: 6.704915E+00 | loss scale: 16384.0 | grad norm: 101640.576 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2867/ 159576 | consumed samples: 45872 | elapsed time per iteration (ms): 13607.2 | learning rate: 1.271E-05 | global batch size: 16 | lm loss: 6.825719E+00 | loss scale: 16384.0 | grad norm: 75740.165 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2868/ 159576 | consumed samples: 45888 | elapsed time per iteration (ms): 13630.4 | learning rate: 1.271E-05 | global batch size: 16 | lm loss: 6.287379E+00 | loss scale: 16384.0 | grad norm: 102389.724 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2869/ 159576 | consumed samples: 45904 | elapsed time per iteration (ms): 13745.4 | learning rate: 1.271E-05 | global batch size: 16 | lm loss: 6.541815E+00 | loss scale: 16384.0 | grad norm: 70149.993 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2870/ 159576 | consumed samples: 45920 | elapsed time per iteration (ms): 13607.8 | learning rate: 1.272E-05 | global batch size: 16 | lm loss: 6.516257E+00 | loss scale: 16384.0 | grad norm: 75996.012 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2871/ 159576 | consumed samples: 45936 | elapsed time per iteration (ms): 13612.1 | learning rate: 1.272E-05 | global batch size: 16 | lm loss: 6.478125E+00 | loss scale: 16384.0 | grad norm: 71923.342 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2872/ 159576 | consumed samples: 45952 | elapsed time per iteration (ms): 13608.0 | learning rate: 1.273E-05 | global batch size: 16 | lm loss: 6.691109E+00 | loss scale: 16384.0 | grad norm: 87426.398 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2873/ 159576 | consumed samples: 45968 | elapsed time per iteration (ms): 13976.7 | learning rate: 1.273E-05 | global batch size: 16 | lm loss: 6.620930E+00 | loss scale: 16384.0 | grad norm: 104041.099 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2874/ 159576 | consumed samples: 45984 | elapsed time per iteration (ms): 13607.9 | learning rate: 1.274E-05 | global batch size: 16 | lm loss: 6.744573E+00 | loss scale: 16384.0 | grad norm: 69927.423 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2875/ 159576 | consumed samples: 46000 | elapsed time per iteration (ms): 13661.2 | learning rate: 1.274E-05 | global batch size: 16 | lm loss: 6.676423E+00 | loss scale: 16384.0 | grad norm: 51002.581 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2876/ 159576 | consumed samples: 46016 | elapsed time per iteration (ms): 13531.2 | learning rate: 1.275E-05 | global batch size: 16 | lm loss: 6.802640E+00 | loss scale: 16384.0 | grad norm: 87004.969 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2877/ 159576 | consumed samples: 46032 | elapsed time per iteration (ms): 13901.7 | learning rate: 1.275E-05 | global batch size: 16 | lm loss: 6.729659E+00 | loss scale: 16384.0 | grad norm: 50767.745 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2878/ 159576 | consumed samples: 46048 | elapsed time per iteration (ms): 13702.1 | learning rate: 1.275E-05 | global batch size: 16 | lm loss: 6.922673E+00 | loss scale: 16384.0 | grad norm: 121433.743 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2879/ 159576 | consumed samples: 46064 | elapsed time per iteration (ms): 13605.9 | learning rate: 1.276E-05 | global batch size: 16 | lm loss: 6.701990E+00 | loss scale: 16384.0 | grad norm: 78796.373 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2880/ 159576 | consumed samples: 46080 | elapsed time per iteration (ms): 13615.6 | learning rate: 1.276E-05 | global batch size: 16 | lm loss: 6.650718E+00 | loss scale: 16384.0 | grad norm: 68193.269 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2881/ 159576 | consumed samples: 46096 | elapsed time per iteration (ms): 13595.5 | learning rate: 1.277E-05 | global batch size: 16 | lm loss: 6.732479E+00 | loss scale: 16384.0 | grad norm: 69049.929 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2882/ 159576 | consumed samples: 46112 | elapsed time per iteration (ms): 13888.6 | learning rate: 1.277E-05 | global batch size: 16 | lm loss: 6.563155E+00 | loss scale: 16384.0 | grad norm: 84383.583 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2883/ 159576 | consumed samples: 46128 | elapsed time per iteration (ms): 13560.8 | learning rate: 1.278E-05 | global batch size: 16 | lm loss: 6.406487E+00 | loss scale: 16384.0 | grad norm: 66632.669 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2884/ 159576 | consumed samples: 46144 | elapsed time per iteration (ms): 13502.0 | learning rate: 1.278E-05 | global batch size: 16 | lm loss: 6.748409E+00 | loss scale: 16384.0 | grad norm: 69626.540 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2885/ 159576 | consumed samples: 46160 | elapsed time per iteration (ms): 13526.3 | learning rate: 1.279E-05 | global batch size: 16 | lm loss: 6.474768E+00 | loss scale: 16384.0 | grad norm: 43811.525 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
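[Note, not part of the original log: at this point the run sustains roughly 13.5-14.0 s per iteration at global batch size 16. A back-of-the-envelope throughput helper; the sequence length of 2048 is my assumption about this run, not something stated in the log itself.]

```python
# Rough throughput from the fields logged above.
SEQ_LEN = 2048  # assumed sequence length; not printed in these log lines

def throughput(global_batch_size, ms_per_iter, seq_len=SEQ_LEN):
    """Return (samples/sec, tokens/sec) implied by one logged iteration."""
    sec = ms_per_iter / 1000.0
    samples_per_sec = global_batch_size / sec
    return samples_per_sec, samples_per_sec * seq_len

# e.g. iteration 2885 above: 16 samples / 13.5263 s
#   -> ~1.18 samples/s, ~2.4k tokens/s under the assumed sequence length
print(throughput(16, 13526.3))
```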
- iteration 2886/ 159576 | consumed samples: 46176 | elapsed time per iteration (ms): 13863.4 | learning rate: 1.279E-05 | global batch size: 16 | lm loss: 6.661960E+00 | loss scale: 16384.0 | grad norm: 71612.680 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2887/ 159576 | consumed samples: 46192 | elapsed time per iteration (ms): 13578.7 | learning rate: 1.279E-05 | global batch size: 16 | lm loss: 6.511534E+00 | loss scale: 16384.0 | grad norm: 60456.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2888/ 159576 | consumed samples: 46208 | elapsed time per iteration (ms): 13588.8 | learning rate: 1.280E-05 | global batch size: 16 | lm loss: 6.689698E+00 | loss scale: 16384.0 | grad norm: 101410.640 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2889/ 159576 | consumed samples: 46224 | elapsed time per iteration (ms): 13621.2 | learning rate: 1.280E-05 | global batch size: 16 | lm loss: 6.679986E+00 | loss scale: 16384.0 | grad norm: 74313.427 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2890/ 159576 | consumed samples: 46240 | elapsed time per iteration (ms): 13599.6 | learning rate: 1.281E-05 | global batch size: 16 | lm loss: 6.579202E+00 | loss scale: 16384.0 | grad norm: 53116.582 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2891/ 159576 | consumed samples: 46256 | elapsed time per iteration (ms): 13965.8 | learning rate: 1.281E-05 | global batch size: 16 | lm loss: 6.841757E+00 | loss scale: 16384.0 | grad norm: 71980.947 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2892/ 159576 | consumed samples: 46272 | elapsed time per iteration (ms): 13517.0 | learning rate: 1.282E-05 | global batch size: 16 | lm loss: 6.555973E+00 | loss scale: 16384.0 | grad norm: 90572.552 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2893/ 159576 | consumed samples: 46288 | elapsed time per iteration (ms): 13525.5 | learning rate: 1.282E-05 | global batch size: 16 | lm loss: 6.857316E+00 | loss scale: 16384.0 | grad norm: 60488.506 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2894/ 159576 | consumed samples: 46304 | elapsed time per iteration (ms): 13541.9 | learning rate: 1.283E-05 | global batch size: 16 | lm loss: 6.685534E+00 | loss scale: 16384.0 | grad norm: 69134.968 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2895/ 159576 | consumed samples: 46320 | elapsed time per iteration (ms): 14148.5 | learning rate: 1.283E-05 | global batch size: 16 | lm loss: 6.805571E+00 | loss scale: 16384.0 | grad norm: 57858.457 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2896/ 159576 | consumed samples: 46336 | elapsed time per iteration (ms): 13614.8 | learning rate: 1.283E-05 | global batch size: 16 | lm loss: 6.839938E+00 | loss scale: 16384.0 | grad norm: 146916.459 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2897/ 159576 | consumed samples: 46352 | elapsed time per iteration (ms): 13601.5 | learning rate: 1.284E-05 | global batch size: 16 | lm loss: 6.725083E+00 | loss scale: 16384.0 | grad norm: 101921.781 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2898/ 159576 | consumed samples: 46368 | elapsed time per iteration (ms): 13584.0 | learning rate: 1.284E-05 | global batch size: 16 | lm loss: 7.088351E+00 | loss scale: 16384.0 | grad norm: 78883.090 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2899/ 159576 | consumed samples: 46384 | elapsed time per iteration (ms): 14019.6 | learning rate: 1.285E-05 | global batch size: 16 | lm loss: 6.874489E+00 | loss scale: 16384.0 | grad norm: 79406.253 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2900/ 159576 | consumed samples: 46400 | elapsed time per iteration (ms): 13571.5 | learning rate: 1.285E-05 | global batch size: 16 | lm loss: 6.735637E+00 | loss scale: 16384.0 | grad norm: 58170.467 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2901/ 159576 | consumed samples: 46416 | elapsed time per iteration (ms): 13559.8 | learning rate: 1.286E-05 | global batch size: 16 | lm loss: 6.789194E+00 | loss scale: 16384.0 | grad norm: 153130.501 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2902/ 159576 | consumed samples: 46432 | elapsed time per iteration (ms): 13570.5 | learning rate: 1.286E-05 | global batch size: 16 | lm loss: 6.734316E+00 | loss scale: 16384.0 | grad norm: 116070.395 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2903/ 159576 | consumed samples: 46448 | elapsed time per iteration (ms): 13629.7 | learning rate: 1.287E-05 | global batch size: 16 | lm loss: 6.743185E+00 | loss scale: 16384.0 | grad norm: 76970.593 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2904/ 159576 | consumed samples: 46464 | elapsed time per iteration (ms): 13980.9 | learning rate: 1.287E-05 | global batch size: 16 | lm loss: 6.742231E+00 | loss scale: 16384.0 | grad norm: 79904.122 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2905/ 159576 | consumed samples: 46480 | elapsed time per iteration (ms): 13647.6 | learning rate: 1.287E-05 | global batch size: 16 | lm loss: 6.785865E+00 | loss scale: 16384.0 | grad norm: 66541.967 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2906/ 159576 | consumed samples: 46496 | elapsed time per iteration (ms): 13586.1 | learning rate: 1.288E-05 | global batch size: 16 | lm loss: 6.669911E+00 | loss scale: 16384.0 | grad norm: 76560.935 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2907/ 159576 | consumed samples: 46512 | elapsed time per iteration (ms): 13521.3 | learning rate: 1.288E-05 | global batch size: 16 | lm loss: 6.723244E+00 | loss scale: 16384.0 | grad norm: 103466.024 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2908/ 159576 | consumed samples: 46528 | elapsed time per iteration (ms): 13824.4 | learning rate: 1.289E-05 | global batch size: 16 | lm loss: 6.584032E+00 | loss scale: 16384.0 | grad norm: 73252.234 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2909/ 159576 | consumed samples: 46544 | elapsed time per iteration (ms): 13578.9 | learning rate: 1.289E-05 | global batch size: 16 | lm loss: 6.804316E+00 | loss scale: 16384.0 | grad norm: 70073.019 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2910/ 159576 | consumed samples: 46560 | elapsed time per iteration (ms): 13556.4 | learning rate: 1.290E-05 | global batch size: 16 | lm loss: 6.673604E+00 | loss scale: 16384.0 | grad norm: 109090.622 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2911/ 159576 | consumed samples: 46576 | elapsed time per iteration (ms): 13604.0 | learning rate: 1.290E-05 | global batch size: 16 | lm loss: 6.599095E+00 | loss scale: 16384.0 | grad norm: 57781.446 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2912/ 159576 | consumed samples: 46592 | elapsed time per iteration (ms): 13587.1 | learning rate: 1.291E-05 | global batch size: 16 | lm loss: 6.753370E+00 | loss scale: 16384.0 | grad norm: 76832.152 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2913/ 159576 | consumed samples: 46608 | elapsed time per iteration (ms): 13861.5 | learning rate: 1.291E-05 | global batch size: 16 | lm loss: 6.854298E+00 | loss scale: 16384.0 | grad norm: 72132.986 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2914/ 159576 | consumed samples: 46624 | elapsed time per iteration (ms): 13559.0 | learning rate: 1.291E-05 | global batch size: 16 | lm loss: 6.579864E+00 | loss scale: 16384.0 | grad norm: 74308.017 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2915/ 159576 | consumed samples: 46640 | elapsed time per iteration (ms): 13594.5 | learning rate: 1.292E-05 | global batch size: 16 | lm loss: 6.756865E+00 | loss scale: 16384.0 | grad norm: 54456.279 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2916/ 159576 | consumed samples: 46656 | elapsed time per iteration (ms): 13569.5 | learning rate: 1.292E-05 | global batch size: 16 | lm loss: 6.743901E+00 | loss scale: 16384.0 | grad norm: 55395.902 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2917/ 159576 | consumed samples: 46672 | elapsed time per iteration (ms): 13964.6 | learning rate: 1.293E-05 | global batch size: 16 | lm loss: 6.671132E+00 | loss scale: 16384.0 | grad norm: 82925.647 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2918/ 159576 | consumed samples: 46688 | elapsed time per iteration (ms): 13641.5 | learning rate: 1.293E-05 | global batch size: 16 | lm loss: 6.554927E+00 | loss scale: 16384.0 | grad norm: 64164.151 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2919/ 159576 | consumed samples: 46704 | elapsed time per iteration (ms): 13635.2 | learning rate: 1.294E-05 | global batch size: 16 | lm loss: 6.848719E+00 | loss scale: 16384.0 | grad norm: 67718.918 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2920/ 159576 | consumed samples: 46720 | elapsed time per iteration (ms): 13603.6 | learning rate: 1.294E-05 | global batch size: 16 | lm loss: 6.609835E+00 | loss scale: 16384.0 | grad norm: 64921.543 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
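[Note, not part of the original log: the "loss scale", "number of skipped iterations" and "number of nan iterations" columns come from fp16 dynamic loss scaling. A simplified model of that behaviour, loosely patterned on the Megatron/DeepSpeed scheme rather than taken from its code; the growth interval of 1000 clean steps is an assumption.]

```python
# Simplified fp16 dynamic loss scaler: halve the scale and skip the step on
# overflow, double it after a window of overflow-free steps. A constant
# 16384.0 with zero skipped iterations, as in the stretch above, simply means
# no overflow occurred here.
class DynamicLossScaler:
    def __init__(self, scale=16384.0, growth_interval=1000):
        self.scale = scale
        self.growth_interval = growth_interval  # assumed window length
        self.clean_steps = 0
        self.skipped = 0    # corresponds to "number of skipped iterations"

    def update(self, found_overflow: bool):
        if found_overflow:
            self.scale = max(self.scale / 2.0, 1.0)
            self.clean_steps = 0
            self.skipped += 1
        else:
            self.clean_steps += 1
            if self.clean_steps % self.growth_interval == 0:
                self.scale *= 2.0
```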
- iteration 2921/ 159576 | consumed samples: 46736 | elapsed time per iteration (ms): 13865.5 | learning rate: 1.295E-05 | global batch size: 16 | lm loss: 6.699195E+00 | loss scale: 16384.0 | grad norm: 76865.088 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2922/ 159576 | consumed samples: 46752 | elapsed time per iteration (ms): 13659.4 | learning rate: 1.295E-05 | global batch size: 16 | lm loss: 6.821632E+00 | loss scale: 16384.0 | grad norm: 105825.800 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2923/ 159576 | consumed samples: 46768 | elapsed time per iteration (ms): 13539.7 | learning rate: 1.295E-05 | global batch size: 16 | lm loss: 6.632296E+00 | loss scale: 16384.0 | grad norm: 85548.400 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2924/ 159576 | consumed samples: 46784 | elapsed time per iteration (ms): 13587.6 | learning rate: 1.296E-05 | global batch size: 16 | lm loss: 6.782111E+00 | loss scale: 16384.0 | grad norm: 64005.847 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2925/ 159576 | consumed samples: 46800 | elapsed time per iteration (ms): 13566.6 | learning rate: 1.296E-05 | global batch size: 16 | lm loss: 6.513734E+00 | loss scale: 16384.0 | grad norm: 74875.410 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2926/ 159576 | consumed samples: 46816 | elapsed time per iteration (ms): 13817.4 | learning rate: 1.297E-05 | global batch size: 16 | lm loss: 6.610899E+00 | loss scale: 16384.0 | grad norm: 69678.116 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2927/ 159576 | consumed samples: 46832 | elapsed time per iteration (ms): 13615.5 | learning rate: 1.297E-05 | global batch size: 16 | lm loss: 7.086233E+00 | loss scale: 16384.0 | grad norm: 70522.964 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2928/ 159576 | consumed samples: 46848 | elapsed time per iteration (ms): 13566.8 | learning rate: 1.298E-05 | global batch size: 16 | lm loss: 6.598146E+00 | loss scale: 16384.0 | grad norm: 103276.135 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2929/ 159576 | consumed samples: 46864 | elapsed time per iteration (ms): 13567.1 | learning rate: 1.298E-05 | global batch size: 16 | lm loss: 6.593244E+00 | loss scale: 16384.0 | grad norm: 78523.778 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2930/ 159576 | consumed samples: 46880 | elapsed time per iteration (ms): 13919.4 | learning rate: 1.299E-05 | global batch size: 16 | lm loss: 6.528622E+00 | loss scale: 16384.0 | grad norm: 82737.988 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2931/ 159576 | consumed samples: 46896 | elapsed time per iteration (ms): 13557.6 | learning rate: 1.299E-05 | global batch size: 16 | lm loss: 6.605000E+00 | loss scale: 16384.0 | grad norm: 68077.419 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2932/ 159576 | consumed samples: 46912 | elapsed time per iteration (ms): 13570.1 | learning rate: 1.299E-05 | global batch size: 16 | lm loss: 6.595417E+00 | loss scale: 16384.0 | grad norm: 84602.521 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2933/ 159576 | consumed samples: 46928 | elapsed time per iteration (ms): 13606.8 | learning rate: 1.300E-05 | global batch size: 16 | lm loss: 6.730010E+00 | loss scale: 16384.0 | grad norm: 85745.847 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2934/ 159576 | consumed samples: 46944 | elapsed time per iteration (ms): 13584.8 | learning rate: 1.300E-05 | global batch size: 16 | lm loss: 6.689770E+00 | loss scale: 16384.0 | grad norm: 62655.073 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2935/ 159576 | consumed samples: 46960 | elapsed time per iteration (ms): 14053.4 | learning rate: 1.301E-05 | global batch size: 16 | lm loss: 6.715128E+00 | loss scale: 16384.0 | grad norm: 65695.924 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2936/ 159576 | consumed samples: 46976 | elapsed time per iteration (ms): 13589.9 | learning rate: 1.301E-05 | global batch size: 16 | lm loss: 6.651369E+00 | loss scale: 16384.0 | grad norm: 55322.170 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2937/ 159576 | consumed samples: 46992 | elapsed time per iteration (ms): 13553.6 | learning rate: 1.302E-05 | global batch size: 16 | lm loss: 6.646598E+00 | loss scale: 16384.0 | grad norm: 105686.832 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2938/ 159576 | consumed samples: 47008 | elapsed time per iteration (ms): 13584.5 | learning rate: 1.302E-05 | global batch size: 16 | lm loss: 6.798124E+00 | loss scale: 16384.0 | grad norm: 62478.433 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2939/ 159576 | consumed samples: 47024 | elapsed time per iteration (ms): 13902.5 | learning rate: 1.303E-05 | global batch size: 16 | lm loss: 6.594469E+00 | loss scale: 16384.0 | grad norm: 66128.402 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2940/ 159576 | consumed samples: 47040 | elapsed time per iteration (ms): 13632.4 | learning rate: 1.303E-05 | global batch size: 16 | lm loss: 6.642596E+00 | loss scale: 16384.0 | grad norm: 70291.389 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2941/ 159576 | consumed samples: 47056 | elapsed time per iteration (ms): 13595.9 | learning rate: 1.303E-05 | global batch size: 16 | lm loss: 6.428228E+00 | loss scale: 16384.0 | grad norm: 88273.080 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2942/ 159576 | consumed samples: 47072 | elapsed time per iteration (ms): 13622.0 | learning rate: 1.304E-05 | global batch size: 16 | lm loss: 6.776118E+00 | loss scale: 16384.0 | grad norm: 66140.535 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2943/ 159576 | consumed samples: 47088 | elapsed time per iteration (ms): 13949.2 | learning rate: 1.304E-05 | global batch size: 16 | lm loss: 6.678353E+00 | loss scale: 16384.0 | grad norm: 68411.066 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2944/ 159576 | consumed samples: 47104 | elapsed time per iteration (ms): 13581.2 | learning rate: 1.305E-05 | global batch size: 16 | lm loss: 6.679141E+00 | loss scale: 16384.0 | grad norm: 85622.880 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2945/ 159576 | consumed samples: 47120 | elapsed time per iteration (ms): 13544.3 | learning rate: 1.305E-05 | global batch size: 16 | lm loss: 6.620451E+00 | loss scale: 16384.0 | grad norm: 62226.279 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2946/ 159576 | consumed samples: 47136 | elapsed time per iteration (ms): 13593.9 | learning rate: 1.306E-05 | global batch size: 16 | lm loss: 6.719603E+00 | loss scale: 16384.0 | grad norm: 90885.483 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2947/ 159576 | consumed samples: 47152 | elapsed time per iteration (ms): 13604.3 | learning rate: 1.306E-05 | global batch size: 16 | lm loss: 6.704114E+00 | loss scale: 16384.0 | grad norm: 67182.295 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2948/ 159576 | consumed samples: 47168 | elapsed time per iteration (ms): 13746.5 | learning rate: 1.307E-05 | global batch size: 16 | lm loss: 6.781267E+00 | loss scale: 16384.0 | grad norm: 85616.234 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2949/ 159576 | consumed samples: 47184 | elapsed time per iteration (ms): 13612.1 | learning rate: 1.307E-05 | global batch size: 16 | lm loss: 6.878286E+00 | loss scale: 16384.0 | grad norm: 83807.291 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2950/ 159576 | consumed samples: 47200 | elapsed time per iteration (ms): 13656.8 | learning rate: 1.307E-05 | global batch size: 16 | lm loss: 6.808831E+00 | loss scale: 16384.0 | grad norm: 99669.258 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2951/ 159576 | consumed samples: 47216 | elapsed time per iteration (ms): 13662.4 | learning rate: 1.308E-05 | global batch size: 16 | lm loss: 6.751644E+00 | loss scale: 16384.0 | grad norm: 60477.798 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2952/ 159576 | consumed samples: 47232 | elapsed time per iteration (ms): 13999.0 | learning rate: 1.308E-05 | global batch size: 16 | lm loss: 6.593210E+00 | loss scale: 16384.0 | grad norm: 72293.070 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2953/ 159576 | consumed samples: 47248 | elapsed time per iteration (ms): 13609.1 | learning rate: 1.309E-05 | global batch size: 16 | lm loss: 6.662547E+00 | loss scale: 16384.0 | grad norm: 49910.061 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2954/ 159576 | consumed samples: 47280 | elapsed time per iteration (ms): 14635.0 | learning rate: 1.310E-05 | global batch size: 32 | lm loss: 6.688079E+00 | loss scale: 16384.0 | grad norm: 111598.336 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
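[Note, not part of the original log: at iteration 2954 the global batch size steps from 16 to 32, and "consumed samples" starts advancing by 32 per iteration while the per-iteration time rises from ~13.6 s to ~14.6 s. A small sketch of the consumed-samples bookkeeping implied by the log; the step point is read off the lines above, and the run's actual batch-size rampup configuration is not shown in this excerpt.]

```python
# Consumed-samples accounting consistent with the log: the counter grows by
# the current global batch size each iteration, with one step 16 -> 32 here.
def consumed_samples(iteration, ramp_iter=2954, gbs_before=16, gbs_after=32):
    before = min(iteration, ramp_iter - 1) * gbs_before
    after = max(0, iteration - (ramp_iter - 1)) * gbs_after
    return before + after

assert consumed_samples(2953) == 47248  # last global-batch-16 iteration above
assert consumed_samples(2954) == 47280  # first global-batch-32 iteration above
assert consumed_samples(3000) == 48752  # matches the iteration-3000 lines below
```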
- iteration 2955/ 159576 | consumed samples: 47312 | elapsed time per iteration (ms): 14591.8 | learning rate: 1.311E-05 | global batch size: 32 | lm loss: 6.657289E+00 | loss scale: 16384.0 | grad norm: 67597.793 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2956/ 159576 | consumed samples: 47344 | elapsed time per iteration (ms): 15030.0 | learning rate: 1.311E-05 | global batch size: 32 | lm loss: 6.554570E+00 | loss scale: 16384.0 | grad norm: 69780.387 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2957/ 159576 | consumed samples: 47376 | elapsed time per iteration (ms): 14563.7 | learning rate: 1.312E-05 | global batch size: 32 | lm loss: 6.741304E+00 | loss scale: 16384.0 | grad norm: 58633.422 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2958/ 159576 | consumed samples: 47408 | elapsed time per iteration (ms): 14589.9 | learning rate: 1.313E-05 | global batch size: 32 | lm loss: 6.601515E+00 | loss scale: 16384.0 | grad norm: 107295.423 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2959/ 159576 | consumed samples: 47440 | elapsed time per iteration (ms): 14625.1 | learning rate: 1.314E-05 | global batch size: 32 | lm loss: 6.683945E+00 | loss scale: 16384.0 | grad norm: 81347.650 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2960/ 159576 | consumed samples: 47472 | elapsed time per iteration (ms): 14964.2 | learning rate: 1.315E-05 | global batch size: 32 | lm loss: 6.790781E+00 | loss scale: 16384.0 | grad norm: 77191.821 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2961/ 159576 | consumed samples: 47504 | elapsed time per iteration (ms): 14557.0 | learning rate: 1.316E-05 | global batch size: 32 | lm loss: 6.749201E+00 | loss scale: 16384.0 | grad norm: 82408.963 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2962/ 159576 | consumed samples: 47536 | elapsed time per iteration (ms): 14666.5 | learning rate: 1.317E-05 | global batch size: 32 | lm loss: 6.532114E+00 | loss scale: 16384.0 | grad norm: 51870.305 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2963/ 159576 | consumed samples: 47568 | elapsed time per iteration (ms): 14537.9 | learning rate: 1.318E-05 | global batch size: 32 | lm loss: 6.660976E+00 | loss scale: 16384.0 | grad norm: 66392.838 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2964/ 159576 | consumed samples: 47600 | elapsed time per iteration (ms): 15078.8 | learning rate: 1.318E-05 | global batch size: 32 | lm loss: 6.526144E+00 | loss scale: 16384.0 | grad norm: 54716.550 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2965/ 159576 | consumed samples: 47632 | elapsed time per iteration (ms): 14737.9 | learning rate: 1.319E-05 | global batch size: 32 | lm loss: 6.649373E+00 | loss scale: 16384.0 | grad norm: 51359.795 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2966/ 159576 | consumed samples: 47664 | elapsed time per iteration (ms): 14559.9 | learning rate: 1.320E-05 | global batch size: 32 | lm loss: 6.672748E+00 | loss scale: 16384.0 | grad norm: 73789.982 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2967/ 159576 | consumed samples: 47696 | elapsed time per iteration (ms): 14642.3 | learning rate: 1.321E-05 | global batch size: 32 | lm loss: 6.662704E+00 | loss scale: 16384.0 | grad norm: 66303.151 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2968/ 159576 | consumed samples: 47728 | elapsed time per iteration (ms): 14852.7 | learning rate: 1.322E-05 | global batch size: 32 | lm loss: 6.624488E+00 | loss scale: 16384.0 | grad norm: 59052.609 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2969/ 159576 | consumed samples: 47760 | elapsed time per iteration (ms): 14836.6 | learning rate: 1.323E-05 | global batch size: 32 | lm loss: 6.600084E+00 | loss scale: 16384.0 | grad norm: 62547.253 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2970/ 159576 | consumed samples: 47792 | elapsed time per iteration (ms): 14593.7 | learning rate: 1.324E-05 | global batch size: 32 | lm loss: 6.517389E+00 | loss scale: 16384.0 | grad norm: 60694.546 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2971/ 159576 | consumed samples: 47824 | elapsed time per iteration (ms): 14618.4 | learning rate: 1.325E-05 | global batch size: 32 | lm loss: 6.548014E+00 | loss scale: 16384.0 | grad norm: 43913.010 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2972/ 159576 | consumed samples: 47856 | elapsed time per iteration (ms): 14695.6 | learning rate: 1.326E-05 | global batch size: 32 | lm loss: 6.593935E+00 | loss scale: 16384.0 | grad norm: 63488.321 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2973/ 159576 | consumed samples: 47888 | elapsed time per iteration (ms): 14827.1 | learning rate: 1.326E-05 | global batch size: 32 | lm loss: 6.572222E+00 | loss scale: 16384.0 | grad norm: 54368.202 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2974/ 159576 | consumed samples: 47920 | elapsed time per iteration (ms): 14620.6 | learning rate: 1.327E-05 | global batch size: 32 | lm loss: 6.550548E+00 | loss scale: 16384.0 | grad norm: 87940.074 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2975/ 159576 | consumed samples: 47952 | elapsed time per iteration (ms): 14622.4 | learning rate: 1.328E-05 | global batch size: 32 | lm loss: 6.529421E+00 | loss scale: 16384.0 | grad norm: 60145.649 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2976/ 159576 | consumed samples: 47984 | elapsed time per iteration (ms): 14586.4 | learning rate: 1.329E-05 | global batch size: 32 | lm loss: 6.765855E+00 | loss scale: 16384.0 | grad norm: 83899.803 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2977/ 159576 | consumed samples: 48016 | elapsed time per iteration (ms): 14810.9 | learning rate: 1.330E-05 | global batch size: 32 | lm loss: 6.630699E+00 | loss scale: 16384.0 | grad norm: 44149.072 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2978/ 159576 | consumed samples: 48048 | elapsed time per iteration (ms): 14685.4 | learning rate: 1.331E-05 | global batch size: 32 | lm loss: 6.561995E+00 | loss scale: 16384.0 | grad norm: 87446.523 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2979/ 159576 | consumed samples: 48080 | elapsed time per iteration (ms): 14648.9 | learning rate: 1.332E-05 | global batch size: 32 | lm loss: 6.467924E+00 | loss scale: 16384.0 | grad norm: 65034.519 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2980/ 159576 | consumed samples: 48112 | elapsed time per iteration (ms): 14615.3 | learning rate: 1.333E-05 | global batch size: 32 | lm loss: 6.649030E+00 | loss scale: 16384.0 | grad norm: 92148.403 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2981/ 159576 | consumed samples: 48144 | elapsed time per iteration (ms): 14681.7 | learning rate: 1.334E-05 | global batch size: 32 | lm loss: 6.749784E+00 | loss scale: 16384.0 | grad norm: 61670.963 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2982/ 159576 | consumed samples: 48176 | elapsed time per iteration (ms): 14509.6 | learning rate: 1.334E-05 | global batch size: 32 | lm loss: 6.567672E+00 | loss scale: 16384.0 | grad norm: 79628.022 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2983/ 159576 | consumed samples: 48208 | elapsed time per iteration (ms): 14555.2 | learning rate: 1.335E-05 | global batch size: 32 | lm loss: 6.676024E+00 | loss scale: 16384.0 | grad norm: 65136.709 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2984/ 159576 | consumed samples: 48240 | elapsed time per iteration (ms): 14572.2 | learning rate: 1.336E-05 | global batch size: 32 | lm loss: 6.467518E+00 | loss scale: 16384.0 | grad norm: 90637.625 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2985/ 159576 | consumed samples: 48272 | elapsed time per iteration (ms): 14888.7 | learning rate: 1.337E-05 | global batch size: 32 | lm loss: 6.586103E+00 | loss scale: 16384.0 | grad norm: 81306.452 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2986/ 159576 | consumed samples: 48304 | elapsed time per iteration (ms): 14588.0 | learning rate: 1.338E-05 | global batch size: 32 | lm loss: 6.541125E+00 | loss scale: 16384.0 | grad norm: 62368.768 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2987/ 159576 | consumed samples: 48336 | elapsed time per iteration (ms): 14597.9 | learning rate: 1.339E-05 | global batch size: 32 | lm loss: 6.591407E+00 | loss scale: 16384.0 | grad norm: 87504.003 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2988/ 159576 | consumed samples: 48368 | elapsed time per iteration (ms): 14590.3 | learning rate: 1.340E-05 | global batch size: 32 | lm loss: 6.678365E+00 | loss scale: 16384.0 | grad norm: 78293.170 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2989/ 159576 | consumed samples: 48400 | elapsed time per iteration (ms): 15031.9 | learning rate: 1.341E-05 | global batch size: 32 | lm loss: 6.564939E+00 | loss scale: 16384.0 | grad norm: 77173.924 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2990/ 159576 | consumed samples: 48432 | elapsed time per iteration (ms): 14705.4 | learning rate: 1.342E-05 | global batch size: 32 | lm loss: 6.692814E+00 | loss scale: 16384.0 | grad norm: 57544.626 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2991/ 159576 | consumed samples: 48464 | elapsed time per iteration (ms): 14586.3 | learning rate: 1.342E-05 | global batch size: 32 | lm loss: 6.628499E+00 | loss scale: 16384.0 | grad norm: 75164.585 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2992/ 159576 | consumed samples: 48496 | elapsed time per iteration (ms): 14624.5 | learning rate: 1.343E-05 | global batch size: 32 | lm loss: 6.582328E+00 | loss scale: 16384.0 | grad norm: 79666.420 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2993/ 159576 | consumed samples: 48528 | elapsed time per iteration (ms): 14950.3 | learning rate: 1.344E-05 | global batch size: 32 | lm loss: 6.558386E+00 | loss scale: 16384.0 | grad norm: 55234.958 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2994/ 159576 | consumed samples: 48560 | elapsed time per iteration (ms): 14695.8 | learning rate: 1.345E-05 | global batch size: 32 | lm loss: 6.676173E+00 | loss scale: 16384.0 | grad norm: 99524.065 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2995/ 159576 | consumed samples: 48592 | elapsed time per iteration (ms): 14559.9 | learning rate: 1.346E-05 | global batch size: 32 | lm loss: 6.529976E+00 | loss scale: 16384.0 | grad norm: 70438.746 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2996/ 159576 | consumed samples: 48624 | elapsed time per iteration (ms): 14633.0 | learning rate: 1.347E-05 | global batch size: 32 | lm loss: 6.703721E+00 | loss scale: 16384.0 | grad norm: 67881.530 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2997/ 159576 | consumed samples: 48656 | elapsed time per iteration (ms): 14953.9 | learning rate: 1.348E-05 | global batch size: 32 | lm loss: 6.681076E+00 | loss scale: 16384.0 | grad norm: 65351.573 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2998/ 159576 | consumed samples: 48688 | elapsed time per iteration (ms): 14564.8 | learning rate: 1.349E-05 | global batch size: 32 | lm loss: 6.741710E+00 | loss scale: 16384.0 | grad norm: 80607.424 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 2999/ 159576 | consumed samples: 48720 | elapsed time per iteration (ms): 14589.6 | learning rate: 1.350E-05 | global batch size: 32 | lm loss: 6.628071E+00 | loss scale: 16384.0 | grad norm: 80325.390 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3000/ 159576 | consumed samples: 48752 | elapsed time per iteration (ms): 14580.8 | learning rate: 1.350E-05 | global batch size: 32 | lm loss: 6.770764E+00 | loss scale: 16384.0 | grad norm: 62881.195 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-------------------------------------------------------------------------------------------------
- validation loss at iteration 3000 | lm loss value: 6.650529E+00 | lm loss PPL: 7.731932E+02 |
-------------------------------------------------------------------------------------------------
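[Note, not part of the original log: the validation line reports both the loss and its perplexity, and the two are consistent since PPL = exp(lm loss). A one-line check:]

```python
import math

# PPL = exp(lm loss); reproduces the reported value up to print precision.
lm_loss = 6.650529
print(f"{math.exp(lm_loss):.4E}")  # -> ~7.7319E+02, matching "lm loss PPL" above
```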
-saving checkpoint at iteration 3000 to /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints
-[2021-09-24 13:45:23,476] [INFO] [logging.py:68:log_dist] [Rank 0] Saving model checkpoint: /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/global_step3000/mp_rank_00_model_states.pt
- successfully saved checkpoint at iteration 3000 to /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints
-time (ms) | save-checkpoint: 19970.36
- iteration 3001/ 159576 | consumed samples: 48784 | elapsed time per iteration (ms): 40923.5 | learning rate: 1.351E-05 | global batch size: 32 | lm loss: 6.512466E+00 | loss scale: 16384.0 | grad norm: 78526.093 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3002/ 159576 | consumed samples: 48816 | elapsed time per iteration (ms): 14454.5 | learning rate: 1.352E-05 | global batch size: 32 | lm loss: 6.725769E+00 | loss scale: 16384.0 | grad norm: 52532.916 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3003/ 159576 | consumed samples: 48848 | elapsed time per iteration (ms): 14508.9 | learning rate: 1.353E-05 | global batch size: 32 | lm loss: 6.778904E+00 | loss scale: 16384.0 | grad norm: 61815.208 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3004/ 159576 | consumed samples: 48880 | elapsed time per iteration (ms): 14774.8 | learning rate: 1.354E-05 | global batch size: 32 | lm loss: 6.600959E+00 | loss scale: 16384.0 | grad norm: 72563.840 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3005/ 159576 | consumed samples: 48912 | elapsed time per iteration (ms): 14543.7 | learning rate: 1.355E-05 | global batch size: 32 | lm loss: 6.630536E+00 | loss scale: 16384.0 | grad norm: 52120.360 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3006/ 159576 | consumed samples: 48944 | elapsed time per iteration (ms): 14501.8 | learning rate: 1.356E-05 | global batch size: 32 | lm loss: 6.661976E+00 | loss scale: 16384.0 | grad norm: 60799.900 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3007/ 159576 | consumed samples: 48976 | elapsed time per iteration (ms): 14465.0 | learning rate: 1.357E-05 | global batch size: 32 | lm loss: 6.695879E+00 | loss scale: 16384.0 | grad norm: 55470.787 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3008/ 159576 | consumed samples: 49008 | elapsed time per iteration (ms): 14696.5 | learning rate: 1.358E-05 | global batch size: 32 | lm loss: 6.613426E+00 | loss scale: 16384.0 | grad norm: 80502.152 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3009/ 159576 | consumed samples: 49040 | elapsed time per iteration (ms): 14441.9 | learning rate: 1.358E-05 | global batch size: 32 | lm loss: 6.640174E+00 | loss scale: 16384.0 | grad norm: 53100.335 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3010/ 159576 | consumed samples: 49072 | elapsed time per iteration (ms): 14484.3 | learning rate: 1.359E-05 | global batch size: 32 | lm loss: 6.660203E+00 | loss scale: 16384.0 | grad norm: 69573.492 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
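A quick consistency check on the validation and checkpoint block above: the reported lm loss PPL is exp() of the lm loss value, and within this stretch of the log consumed samples advance by exactly the global batch size (32) per iteration. A minimal sketch in plain Python (the numbers are copied from the lines above; everything else is illustrative, not part of the training setup):

    import math

    # the PPL column is exp() of the loss column: exp(6.650529) ~= 773.1932
    assert math.isclose(math.exp(6.650529), 7.731932e02, rel_tol=1e-5)

    # consumed samples grow by the global batch size (32) per iteration in this excerpt:
    # iteration 2979 reports 48080 and iteration 3000 reports 48752
    assert 48080 + 32 * (3000 - 2979) == 48752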
- iteration 3011/ 159576 | consumed samples: 49104 | elapsed time per iteration (ms): 14599.1 | learning rate: 1.360E-05 | global batch size: 32 | lm loss: 6.674448E+00 | loss scale: 16384.0 | grad norm: 49737.183 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3012/ 159576 | consumed samples: 49136 | elapsed time per iteration (ms): 14701.4 | learning rate: 1.361E-05 | global batch size: 32 | lm loss: 6.607582E+00 | loss scale: 16384.0 | grad norm: 121923.648 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3013/ 159576 | consumed samples: 49168 | elapsed time per iteration (ms): 14527.2 | learning rate: 1.362E-05 | global batch size: 32 | lm loss: 6.552118E+00 | loss scale: 16384.0 | grad norm: 86117.234 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3014/ 159576 | consumed samples: 49200 | elapsed time per iteration (ms): 14528.7 | learning rate: 1.363E-05 | global batch size: 32 | lm loss: 6.628557E+00 | loss scale: 16384.0 | grad norm: 65341.685 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3015/ 159576 | consumed samples: 49232 | elapsed time per iteration (ms): 14528.2 | learning rate: 1.364E-05 | global batch size: 32 | lm loss: 6.637073E+00 | loss scale: 16384.0 | grad norm: 56388.918 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3016/ 159576 | consumed samples: 49264 | elapsed time per iteration (ms): 14818.6 | learning rate: 1.365E-05 | global batch size: 32 | lm loss: 6.643037E+00 | loss scale: 16384.0 | grad norm: 92476.269 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3017/ 159576 | consumed samples: 49296 | elapsed time per iteration (ms): 14532.4 | learning rate: 1.366E-05 | global batch size: 32 | lm loss: 6.517512E+00 | loss scale: 16384.0 | grad norm: 69528.273 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3018/ 159576 | consumed samples: 49328 | elapsed time per iteration (ms): 14482.9 | learning rate: 1.366E-05 | global batch size: 32 | lm loss: 6.593336E+00 | loss scale: 16384.0 | grad norm: 58227.816 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3019/ 159576 | consumed samples: 49360 | elapsed time per iteration (ms): 14483.3 | learning rate: 1.367E-05 | global batch size: 32 | lm loss: 6.682046E+00 | loss scale: 16384.0 | grad norm: 77807.619 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3020/ 159576 | consumed samples: 49392 | elapsed time per iteration (ms): 15039.4 | learning rate: 1.368E-05 | global batch size: 32 | lm loss: 6.511760E+00 | loss scale: 16384.0 | grad norm: 61711.980 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3021/ 159576 | consumed samples: 49424 | elapsed time per iteration (ms): 14532.3 | learning rate: 1.369E-05 | global batch size: 32 | lm loss: 6.601027E+00 | loss scale: 16384.0 | grad norm: 59045.542 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3022/ 159576 | consumed samples: 49456 | elapsed time per iteration (ms): 14411.9 | learning rate: 1.370E-05 | global batch size: 32 | lm loss: 6.669757E+00 | loss scale: 16384.0 | grad norm: 79072.886 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3023/ 159576 | consumed samples: 49488 | elapsed time per iteration (ms): 14433.5 | learning rate: 1.371E-05 | global batch size: 32 | lm loss: 6.660283E+00 | loss scale: 16384.0 | grad norm: 83581.808 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3024/ 159576 | consumed samples: 49520 | elapsed time per iteration (ms): 14915.2 | learning rate: 1.372E-05 | global batch size: 32 | lm loss: 6.621551E+00 | loss scale: 16384.0 | grad norm: 64854.144 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3025/ 159576 | consumed samples: 49552 | elapsed time per iteration (ms): 14425.9 | learning rate: 1.373E-05 | global batch size: 32 | lm loss: 6.591113E+00 | loss scale: 16384.0 | grad norm: 52620.079 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3026/ 159576 | consumed samples: 49584 | elapsed time per iteration (ms): 14542.0 | learning rate: 1.374E-05 | global batch size: 32 | lm loss: 6.659728E+00 | loss scale: 16384.0 | grad norm: 50471.019 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3027/ 159576 | consumed samples: 49616 | elapsed time per iteration (ms): 14493.7 | learning rate: 1.374E-05 | global batch size: 32 | lm loss: 6.786015E+00 | loss scale: 16384.0 | grad norm: 89599.838 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3028/ 159576 | consumed samples: 49648 | elapsed time per iteration (ms): 14955.9 | learning rate: 1.375E-05 | global batch size: 32 | lm loss: 6.515626E+00 | loss scale: 16384.0 | grad norm: 71757.893 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3029/ 159576 | consumed samples: 49680 | elapsed time per iteration (ms): 14451.8 | learning rate: 1.376E-05 | global batch size: 32 | lm loss: 6.552487E+00 | loss scale: 16384.0 | grad norm: 59493.240 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3030/ 159576 | consumed samples: 49712 | elapsed time per iteration (ms): 14565.2 | learning rate: 1.377E-05 | global batch size: 32 | lm loss: 6.515723E+00 | loss scale: 16384.0 | grad norm: 70621.618 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3031/ 159576 | consumed samples: 49744 | elapsed time per iteration (ms): 14573.9 | learning rate: 1.378E-05 | global batch size: 32 | lm loss: 6.533678E+00 | loss scale: 16384.0 | grad norm: 67416.578 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3032/ 159576 | consumed samples: 49776 | elapsed time per iteration (ms): 14838.7 | learning rate: 1.379E-05 | global batch size: 32 | lm loss: 6.558086E+00 | loss scale: 16384.0 | grad norm: 57733.715 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3033/ 159576 | consumed samples: 49808 | elapsed time per iteration (ms): 14602.8 | learning rate: 1.380E-05 | global batch size: 32 | lm loss: 6.520467E+00 | loss scale: 16384.0 | grad norm: 82103.090 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3034/ 159576 | consumed samples: 49840 | elapsed time per iteration (ms): 14562.2 | learning rate: 1.381E-05 | global batch size: 32 | lm loss: 6.583010E+00 | loss scale: 16384.0 | grad norm: 49461.985 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3035/ 159576 | consumed samples: 49872 | elapsed time per iteration (ms): 14551.2 | learning rate: 1.382E-05 | global batch size: 32 | lm loss: 6.614191E+00 | loss scale: 16384.0 | grad norm: 42934.432 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3036/ 159576 | consumed samples: 49904 | elapsed time per iteration (ms): 15033.1 | learning rate: 1.382E-05 | global batch size: 32 | lm loss: 6.646058E+00 | loss scale: 16384.0 | grad norm: 72475.817 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3037/ 159576 | consumed samples: 49936 | elapsed time per iteration (ms): 14506.7 | learning rate: 1.383E-05 | global batch size: 32 | lm loss: 6.657450E+00 | loss scale: 16384.0 | grad norm: 51862.308 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3038/ 159576 | consumed samples: 49968 | elapsed time per iteration (ms): 14535.4 | learning rate: 1.384E-05 | global batch size: 32 | lm loss: 6.474831E+00 | loss scale: 16384.0 | grad norm: 54826.780 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3039/ 159576 | consumed samples: 50000 | elapsed time per iteration (ms): 14517.2 | learning rate: 1.385E-05 | global batch size: 32 | lm loss: 6.491888E+00 | loss scale: 16384.0 | grad norm: 48045.161 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3040/ 159576 | consumed samples: 50032 | elapsed time per iteration (ms): 14679.0 | learning rate: 1.386E-05 | global batch size: 32 | lm loss: 6.557182E+00 | loss scale: 16384.0 | grad norm: 79148.314 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3041/ 159576 | consumed samples: 50064 | elapsed time per iteration (ms): 14829.2 | learning rate: 1.387E-05 | global batch size: 32 | lm loss: 6.624621E+00 | loss scale: 16384.0 | grad norm: 50930.400 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3042/ 159576 | consumed samples: 50096 | elapsed time per iteration (ms): 14560.9 | learning rate: 1.388E-05 | global batch size: 32 | lm loss: 6.572658E+00 | loss scale: 16384.0 | grad norm: 72539.370 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3043/ 159576 | consumed samples: 50128 | elapsed time per iteration (ms): 14616.0 | learning rate: 1.389E-05 | global batch size: 32 | lm loss: 6.654581E+00 | loss scale: 16384.0 | grad norm: 66089.217 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3044/ 159576 | consumed samples: 50160 | elapsed time per iteration (ms): 14597.6 | learning rate: 1.389E-05 | global batch size: 32 | lm loss: 6.568760E+00 | loss scale: 16384.0 | grad norm: 77389.521 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3045/ 159576 | consumed samples: 50192 | elapsed time per iteration (ms): 14717.8 | learning rate: 1.390E-05 | global batch size: 32 | lm loss: 6.562954E+00 | loss scale: 16384.0 | grad norm: 59175.329 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3046/ 159576 | consumed samples: 50224 | elapsed time per iteration (ms): 14549.8 | learning rate: 1.391E-05 | global batch size: 32 | lm loss: 6.519083E+00 | loss scale: 16384.0 | grad norm: 72573.917 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3047/ 159576 | consumed samples: 50256 | elapsed time per iteration (ms): 14547.8 | learning rate: 1.392E-05 | global batch size: 32 | lm loss: 6.586189E+00 | loss scale: 16384.0 | grad norm: 63454.734 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3048/ 159576 | consumed samples: 50288 | elapsed time per iteration (ms): 14699.8 | learning rate: 1.393E-05 | global batch size: 32 | lm loss: 6.629214E+00 | loss scale: 16384.0 | grad norm: 49137.706 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3049/ 159576 | consumed samples: 50320 | elapsed time per iteration (ms): 14760.5 | learning rate: 1.394E-05 | global batch size: 32 | lm loss: 6.567476E+00 | loss scale: 16384.0 | grad norm: 59423.684 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3050/ 159576 | consumed samples: 50352 | elapsed time per iteration (ms): 14605.2 | learning rate: 1.395E-05 | global batch size: 32 | lm loss: 6.560441E+00 | loss scale: 16384.0 | grad norm: 76106.960 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3051/ 159576 | consumed samples: 50384 | elapsed time per iteration (ms): 14589.0 | learning rate: 1.396E-05 | global batch size: 32 | lm loss: 6.676329E+00 | loss scale: 16384.0 | grad norm: 43490.816 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3052/ 159576 | consumed samples: 50416 | elapsed time per iteration (ms): 14546.5 | learning rate: 1.397E-05 | global batch size: 32 | lm loss: 6.531154E+00 | loss scale: 16384.0 | grad norm: 77324.537 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3053/ 159576 | consumed samples: 50448 | elapsed time per iteration (ms): 14689.5 | learning rate: 1.397E-05 | global batch size: 32 | lm loss: 6.457368E+00 | loss scale: 16384.0 | grad norm: 61005.030 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3054/ 159576 | consumed samples: 50480 | elapsed time per iteration (ms): 14604.5 | learning rate: 1.398E-05 | global batch size: 32 | lm loss: 6.694659E+00 | loss scale: 16384.0 | grad norm: 50570.324 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3055/ 159576 | consumed samples: 50512 | elapsed time per iteration (ms): 14507.3 | learning rate: 1.399E-05 | global batch size: 32 | lm loss: 6.639795E+00 | loss scale: 16384.0 | grad norm: 57017.274 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3056/ 159576 | consumed samples: 50544 | elapsed time per iteration (ms): 14581.4 | learning rate: 1.400E-05 | global batch size: 32 | lm loss: 6.619573E+00 | loss scale: 16384.0 | grad norm: 60323.172 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3057/ 159576 | consumed samples: 50576 | elapsed time per iteration (ms): 15078.3 | learning rate: 1.401E-05 | global batch size: 32 | lm loss: 6.636419E+00 | loss scale: 16384.0 | grad norm: 49598.220 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3058/ 159576 | consumed samples: 50608 | elapsed time per iteration (ms): 14576.1 | learning rate: 1.402E-05 | global batch size: 32 | lm loss: 6.591126E+00 | loss scale: 16384.0 | grad norm: 102052.253 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3059/ 159576 | consumed samples: 50640 | elapsed time per iteration (ms): 14515.1 | learning rate: 1.403E-05 | global batch size: 32 | lm loss: 6.500241E+00 | loss scale: 16384.0 | grad norm: 52981.506 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3060/ 159576 | consumed samples: 50672 | elapsed time per iteration (ms): 14582.7 | learning rate: 1.404E-05 | global batch size: 32 | lm loss: 6.553960E+00 | loss scale: 16384.0 | grad norm: 57341.020 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3061/ 159576 | consumed samples: 50704 | elapsed time per iteration (ms): 14939.5 | learning rate: 1.405E-05 | global batch size: 32 | lm loss: 6.593186E+00 | loss scale: 16384.0 | grad norm: 50198.577 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3062/ 159576 | consumed samples: 50736 | elapsed time per iteration (ms): 14545.7 | learning rate: 1.405E-05 | global batch size: 32 | lm loss: 6.577888E+00 | loss scale: 16384.0 | grad norm: 90008.008 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3063/ 159576 | consumed samples: 50768 | elapsed time per iteration (ms): 14515.8 | learning rate: 1.406E-05 | global batch size: 32 | lm loss: 6.775355E+00 | loss scale: 16384.0 | grad norm: 52343.221 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3064/ 159576 | consumed samples: 50800 | elapsed time per iteration (ms): 14570.2 | learning rate: 1.407E-05 | global batch size: 32 | lm loss: 6.724249E+00 | loss scale: 16384.0 | grad norm: 69939.860 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3065/ 159576 | consumed samples: 50832 | elapsed time per iteration (ms): 14913.0 | learning rate: 1.408E-05 | global batch size: 32 | lm loss: 6.634195E+00 | loss scale: 16384.0 | grad norm: 70070.520 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3066/ 159576 | consumed samples: 50864 | elapsed time per iteration (ms): 14497.8 | learning rate: 1.409E-05 | global batch size: 32 | lm loss: 6.591150E+00 | loss scale: 16384.0 | grad norm: 80109.931 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3067/ 159576 | consumed samples: 50896 | elapsed time per iteration (ms): 14593.4 | learning rate: 1.410E-05 | global batch size: 32 | lm loss: 6.637640E+00 | loss scale: 16384.0 | grad norm: 51104.322 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3068/ 159576 | consumed samples: 50928 | elapsed time per iteration (ms): 14459.7 | learning rate: 1.411E-05 | global batch size: 32 | lm loss: 6.595787E+00 | loss scale: 16384.0 | grad norm: 49458.599 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3069/ 159576 | consumed samples: 50960 | elapsed time per iteration (ms): 14904.6 | learning rate: 1.412E-05 | global batch size: 32 | lm loss: 6.762650E+00 | loss scale: 16384.0 | grad norm: 88087.529 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3070/ 159576 | consumed samples: 50992 | elapsed time per iteration (ms): 14578.7 | learning rate: 1.413E-05 | global batch size: 32 | lm loss: 6.615232E+00 | loss scale: 16384.0 | grad norm: 50851.426 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3071/ 159576 | consumed samples: 51024 | elapsed time per iteration (ms): 14534.9 | learning rate: 1.413E-05 | global batch size: 32 | lm loss: 6.502337E+00 | loss scale: 16384.0 | grad norm: 82199.193 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3072/ 159576 | consumed samples: 51056 | elapsed time per iteration (ms): 14555.3 | learning rate: 1.414E-05 | global batch size: 32 | lm loss: 6.552182E+00 | loss scale: 16384.0 | grad norm: 67542.628 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3073/ 159576 | consumed samples: 51088 | elapsed time per iteration (ms): 15069.2 | learning rate: 1.415E-05 | global batch size: 32 | lm loss: 6.449011E+00 | loss scale: 16384.0 | grad norm: 113973.285 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3074/ 159576 | consumed samples: 51120 | elapsed time per iteration (ms): 14473.5 | learning rate: 1.416E-05 | global batch size: 32 | lm loss: 6.462796E+00 | loss scale: 16384.0 | grad norm: 99530.753 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3075/ 159576 | consumed samples: 51152 | elapsed time per iteration (ms): 14578.5 | learning rate: 1.417E-05 | global batch size: 32 | lm loss: 6.605415E+00 | loss scale: 16384.0 | grad norm: 79580.590 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3076/ 159576 | consumed samples: 51184 | elapsed time per iteration (ms): 14526.0 | learning rate: 1.418E-05 | global batch size: 32 | lm loss: 6.643724E+00 | loss scale: 16384.0 | grad norm: 83910.537 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3077/ 159576 | consumed samples: 51216 | elapsed time per iteration (ms): 14932.5 | learning rate: 1.419E-05 | global batch size: 32 | lm loss: 6.554170E+00 | loss scale: 16384.0 | grad norm: 41888.605 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3078/ 159576 | consumed samples: 51248 | elapsed time per iteration (ms): 14631.5 | learning rate: 1.420E-05 | global batch size: 32 | lm loss: 6.609428E+00 | loss scale: 16384.0 | grad norm: 100795.398 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3079/ 159576 | consumed samples: 51280 | elapsed time per iteration (ms): 14613.6 | learning rate: 1.421E-05 | global batch size: 32 | lm loss: 6.647438E+00 | loss scale: 16384.0 | grad norm: 79478.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3080/ 159576 | consumed samples: 51312 | elapsed time per iteration (ms): 14624.3 | learning rate: 1.421E-05 | global batch size: 32 | lm loss: 6.548526E+00 | loss scale: 16384.0 | grad norm: 61687.326 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3081/ 159576 | consumed samples: 51344 | elapsed time per iteration (ms): 14941.2 | learning rate: 1.422E-05 | global batch size: 32 | lm loss: 6.559642E+00 | loss scale: 16384.0 | grad norm: 51017.983 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3082/ 159576 | consumed samples: 51376 | elapsed time per iteration (ms): 14650.5 | learning rate: 1.423E-05 | global batch size: 32 | lm loss: 6.513590E+00 | loss scale: 16384.0 | grad norm: 62838.209 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3083/ 159576 | consumed samples: 51408 | elapsed time per iteration (ms): 14595.1 | learning rate: 1.424E-05 | global batch size: 32 | lm loss: 6.454400E+00 | loss scale: 16384.0 | grad norm: 85218.080 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3084/ 159576 | consumed samples: 51440 | elapsed time per iteration (ms): 14539.5 | learning rate: 1.425E-05 | global batch size: 32 | lm loss: 6.667971E+00 | loss scale: 16384.0 | grad norm: 74883.565 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3085/ 159576 | consumed samples: 51472 | elapsed time per iteration (ms): 14496.8 | learning rate: 1.426E-05 | global batch size: 32 | lm loss: 6.608503E+00 | loss scale: 16384.0 | grad norm: 64204.771 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3086/ 159576 | consumed samples: 51504 | elapsed time per iteration (ms): 14686.0 | learning rate: 1.427E-05 | global batch size: 32 | lm loss: 6.699879E+00 | loss scale: 16384.0 | grad norm: 42613.436 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-[2021-09-24 14:06:36] PULSE: tr8-104B is waiting for the previous job to finish before scheduling a new one using the dependency mechanism (1165978_[1-10%1] on 'gpu_p13' partition)
-[2021-09-24 14:06:36] PULSE: tr8-104B is running for 8:14:25 since 2021-09-24T05:52:11 (1162855_1 on 'gpu_p13' partition (r6i4n[5,7],r6i5n[2,7-8],r6i6n[0,2,6],r7i2n[4-5],r7i6n[2-4],r7i7n[7-8],r8i0n[2-3,5-8],r8i1n[0,2-4],r8i2n8,r8i3n[0-2],r8i5n[3-4],r8i7n[3-8],r9i0n[0-2],r9i1n[0-3],r9i2n[3-5,8],r9i3n[0-1,7-8],r9i4n[0-2],r9i5n[3-8],r9i6n[0,7-8])
- iteration 3087/ 159576 | consumed samples: 51536 | elapsed time per iteration (ms): 14518.6 | learning rate: 1.428E-05 | global batch size: 32 | lm loss: 6.539448E+00 | loss scale: 16384.0 | grad norm: 88063.533 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3088/ 159576 | consumed samples: 51568 | elapsed time per iteration (ms): 14588.4 | learning rate: 1.429E-05 | global batch size: 32 | lm loss: 6.589184E+00 | loss scale: 16384.0 | grad norm: 54256.309 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3089/ 159576 | consumed samples: 51600 | elapsed time per iteration (ms): 14631.0 | learning rate: 1.429E-05 | global batch size: 32 | lm loss: 6.700484E+00 | loss scale: 16384.0 | grad norm: 54269.384 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
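Records in this shape are easy to scrape for plotting the loss and grad-norm curves. A minimal sketch of one way to do it in plain Python (the regex, function name, and file name are illustrative assumptions, not part of the training setup):

    import re

    # matches the iteration records above; field layout assumed from the samples in this log
    PATTERN = re.compile(
        r"iteration\s+(\d+)/\s*\d+ \| consumed samples:\s+(\d+) \|"
        r".*?lm loss: ([-+0-9.E]+) \|.*?grad norm: ([0-9.]+)"
    )

    def iter_records(path="main_log.txt"):  # hypothetical file name
        """Yield (iteration, consumed_samples, lm_loss, grad_norm) per matching line."""
        with open(path) as fh:
            for line in fh:
                m = PATTERN.search(line)
                if m:
                    yield (int(m.group(1)), int(m.group(2)),
                           float(m.group(3)), float(m.group(4)))

Validation and PULSE lines do not match the pattern (they lack the "consumed samples:" field right after the iteration counter), so they are skipped automatically.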
- iteration 3090/ 159576 | consumed samples: 51632 | elapsed time per iteration (ms): 14830.4 | learning rate: 1.430E-05 | global batch size: 32 | lm loss: 6.576167E+00 | loss scale: 16384.0 | grad norm: 57490.220 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3091/ 159576 | consumed samples: 51664 | elapsed time per iteration (ms): 14445.4 | learning rate: 1.431E-05 | global batch size: 32 | lm loss: 6.601985E+00 | loss scale: 16384.0 | grad norm: 57872.821 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3092/ 159576 | consumed samples: 51696 | elapsed time per iteration (ms): 14536.8 | learning rate: 1.432E-05 | global batch size: 32 | lm loss: 6.407238E+00 | loss scale: 16384.0 | grad norm: 52047.068 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3093/ 159576 | consumed samples: 51728 | elapsed time per iteration (ms): 14606.0 | learning rate: 1.433E-05 | global batch size: 32 | lm loss: 6.659007E+00 | loss scale: 16384.0 | grad norm: 76903.539 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3094/ 159576 | consumed samples: 51760 | elapsed time per iteration (ms): 14751.8 | learning rate: 1.434E-05 | global batch size: 32 | lm loss: 6.623207E+00 | loss scale: 16384.0 | grad norm: 98639.349 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3095/ 159576 | consumed samples: 51792 | elapsed time per iteration (ms): 14636.3 | learning rate: 1.435E-05 | global batch size: 32 | lm loss: 6.697064E+00 | loss scale: 16384.0 | grad norm: 59113.249 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3096/ 159576 | consumed samples: 51824 | elapsed time per iteration (ms): 14701.7 | learning rate: 1.436E-05 | global batch size: 32 | lm loss: 6.510694E+00 | loss scale: 16384.0 | grad norm: 57025.627 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3097/ 159576 | consumed samples: 51856 | elapsed time per iteration (ms): 14643.0 | learning rate: 1.437E-05 | global batch size: 32 | lm loss: 6.610021E+00 | loss scale: 16384.0 | grad norm: 90059.366 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3098/ 159576 | consumed samples: 51888 | elapsed time per iteration (ms): 14837.7 | learning rate: 1.437E-05 | global batch size: 32 | lm loss: 6.534551E+00 | loss scale: 16384.0 | grad norm: 45874.846 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3099/ 159576 | consumed samples: 51920 | elapsed time per iteration (ms): 14607.4 | learning rate: 1.438E-05 | global batch size: 32 | lm loss: 6.517954E+00 | loss scale: 16384.0 | grad norm: 60226.775 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3100/ 159576 | consumed samples: 51952 | elapsed time per iteration (ms): 14537.4 | learning rate: 1.439E-05 | global batch size: 32 | lm loss: 6.457252E+00 | loss scale: 16384.0 | grad norm: 46090.904 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3101/ 159576 | consumed samples: 51984 | elapsed time per iteration (ms): 14526.9 | learning rate: 1.440E-05 | global batch size: 32 | lm loss: 6.609892E+00 | loss scale: 16384.0 | grad norm: 94724.964 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3102/ 159576 | consumed samples: 52016 | elapsed time per iteration (ms): 14927.9 | learning rate: 1.441E-05 | global batch size: 32 | lm loss: 6.698421E+00 | loss scale: 16384.0 | grad norm: 87402.445 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3103/ 159576 | consumed samples: 52048 | elapsed time per iteration (ms): 14723.0 | learning rate: 1.442E-05 | global batch size: 32 | lm loss: 6.607485E+00 | loss scale: 16384.0 | grad norm: 53552.486 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3104/ 159576 | consumed samples: 52080 | elapsed time per iteration (ms): 14655.6 | learning rate: 1.443E-05 | global batch size: 32 | lm loss: 6.771776E+00 | loss scale: 16384.0 | grad norm: 77470.084 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3105/ 159576 | consumed samples: 52112 | elapsed time per iteration (ms): 14632.7 | learning rate: 1.444E-05 | global batch size: 32 | lm loss: 6.573309E+00 | loss scale: 16384.0 | grad norm: 60932.025 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3106/ 159576 | consumed samples: 52144 | elapsed time per iteration (ms): 15115.7 | learning rate: 1.445E-05 | global batch size: 32 | lm loss: 6.610741E+00 | loss scale: 16384.0 | grad norm: 67949.607 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3107/ 159576 | consumed samples: 52176 | elapsed time per iteration (ms): 14559.3 | learning rate: 1.445E-05 | global batch size: 32 | lm loss: 6.538753E+00 | loss scale: 16384.0 | grad norm: 71734.909 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3108/ 159576 | consumed samples: 52208 | elapsed time per iteration (ms): 14588.4 | learning rate: 1.446E-05 | global batch size: 32 | lm loss: 6.527990E+00 | loss scale: 16384.0 | grad norm: 86170.275 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3109/ 159576 | consumed samples: 52240 | elapsed time per iteration (ms): 14660.3 | learning rate: 1.447E-05 | global batch size: 32 | lm loss: 6.556553E+00 | loss scale: 16384.0 | grad norm: 46751.259 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3110/ 159576 | consumed samples: 52272 | elapsed time per iteration (ms): 15046.4 | learning rate: 1.448E-05 | global batch size: 32 | lm loss: 6.566851E+00 | loss scale: 16384.0 | grad norm: 67209.417 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3111/ 159576 | consumed samples: 52304 | elapsed time per iteration (ms): 14570.9 | learning rate: 1.449E-05 | global batch size: 32 | lm loss: 6.635989E+00 | loss scale: 16384.0 | grad norm: 53538.451 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3112/ 159576 | consumed samples: 52336 | elapsed time per iteration (ms): 14664.0 | learning rate: 1.450E-05 | global batch size: 32 | lm loss: 6.739109E+00 | loss scale: 16384.0 | grad norm: 100581.395 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3113/ 159576 | consumed samples: 52368 | elapsed time per iteration (ms): 14690.0 | learning rate: 1.451E-05 | global batch size: 32 | lm loss: 6.534431E+00 | loss scale: 16384.0 | grad norm: 69366.573 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3114/ 159576 | consumed samples: 52400 | elapsed time per iteration (ms): 14854.6 | learning rate: 1.452E-05 | global batch size: 32 | lm loss: 6.481595E+00 | loss scale: 16384.0 | grad norm: 57933.457 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3115/ 159576 | consumed samples: 52432 | elapsed time per iteration (ms): 14581.0 | learning rate: 1.453E-05 | global batch size: 32 | lm loss: 6.466241E+00 | loss scale: 16384.0 | grad norm: 91764.400 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3116/ 159576 | consumed samples: 52464 | elapsed time per iteration (ms): 14603.8 | learning rate: 1.453E-05 | global batch size: 32 | lm loss: 6.818060E+00 | loss scale: 16384.0 | grad norm: 73322.908 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3117/ 159576 | consumed samples: 52496 | elapsed time per iteration (ms): 14655.4 | learning rate: 1.454E-05 | global batch size: 32 | lm loss: 6.541664E+00 | loss scale: 16384.0 | grad norm: 79876.153 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3118/ 159576 | consumed samples: 52528 | elapsed time per iteration (ms): 15059.6 | learning rate: 1.455E-05 | global batch size: 32 | lm loss: 6.582567E+00 | loss scale: 16384.0 | grad norm: 57737.032 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3119/ 159576 | consumed samples: 52560 | elapsed time per iteration (ms): 14561.2 | learning rate: 1.456E-05 | global batch size: 32 | lm loss: 6.616435E+00 | loss scale: 16384.0 | grad norm: 75078.207 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3120/ 159576 | consumed samples: 52592 | elapsed time per iteration (ms): 14627.9 | learning rate: 1.457E-05 | global batch size: 32 | lm loss: 6.688129E+00 | loss scale: 16384.0 | grad norm: 51450.549 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3121/ 159576 | consumed samples: 52624 | elapsed time per iteration (ms): 14579.2 | learning rate: 1.458E-05 | global batch size: 32 | lm loss: 6.456697E+00 | loss scale: 16384.0 | grad norm: 69973.541 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3122/ 159576 | consumed samples: 52656 | elapsed time per iteration (ms): 15025.4 | learning rate: 1.459E-05 | global batch size: 32 | lm loss: 6.629485E+00 | loss scale: 16384.0 | grad norm: 57268.432 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3123/ 159576 | consumed samples: 52688 | elapsed time per iteration (ms): 14578.8 | learning rate: 1.460E-05 | global batch size: 32 | lm loss: 6.404414E+00 | loss scale: 16384.0 | grad norm: 63882.617 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3124/ 159576 | consumed samples: 52720 | elapsed time per iteration (ms): 14582.6 | learning rate: 1.461E-05 | global batch size: 32 | lm loss: 6.473093E+00 | loss scale: 16384.0 | grad norm: 50308.716 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3125/ 159576 | consumed samples: 52752 | elapsed time per iteration (ms): 14640.7 | learning rate: 1.461E-05 | global batch size: 32 | lm loss: 6.497868E+00 | loss scale: 16384.0 | grad norm: 63650.300 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3126/ 159576 | consumed samples: 52784 | elapsed time per iteration (ms): 15046.6 | learning rate: 1.462E-05 | global batch size: 32 | lm loss: 6.549313E+00 | loss scale: 16384.0 | grad norm: 72289.376 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3127/ 159576 | consumed samples: 52816 | elapsed time per iteration (ms): 14723.2 | learning rate: 1.463E-05 | global batch size: 32 | lm loss: 6.590129E+00 | loss scale: 16384.0 | grad norm: 47547.789 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3128/ 159576 | consumed samples: 52848 | elapsed time per iteration (ms): 14552.7 | learning rate: 1.464E-05 | global batch size: 32 | lm loss: 6.731832E+00 | loss scale: 16384.0 | grad norm: 68103.769 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3129/ 159576 | consumed samples: 52880 | elapsed time per iteration (ms): 14573.2 | learning rate: 1.465E-05 | global batch size: 32 | lm loss: 6.528438E+00 | loss scale: 16384.0 | grad norm: 57671.557 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3130/ 159576 | consumed samples: 52912 | elapsed time per iteration (ms): 14663.9 | learning rate: 1.466E-05 | global batch size: 32 | lm loss: 6.672345E+00 | loss scale: 16384.0 | grad norm: 42986.119 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3131/ 159576 | consumed samples: 52944 | elapsed time per iteration (ms): 14852.7 | learning rate: 1.467E-05 | global batch size: 32 | lm loss: 6.489813E+00 | loss scale: 16384.0 | grad norm: 54642.587 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3132/ 159576 | consumed samples: 52976 | elapsed time per iteration (ms): 14644.1 | learning rate: 1.468E-05 | global batch size: 32 | lm loss: 6.597792E+00 | loss scale: 16384.0 | grad norm: 52604.609 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3133/ 159576 | consumed samples: 53008 | elapsed time per iteration (ms): 14641.3 | learning rate: 1.468E-05 | global batch size: 32 | lm loss: 6.527011E+00 | loss scale: 16384.0 | grad norm: 59630.271 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3134/ 159576 | consumed samples: 53040 | elapsed time per iteration (ms): 14626.4 | learning rate: 1.469E-05 | global batch size: 32 | lm loss: 6.581876E+00 | loss scale: 16384.0 | grad norm: 57219.019 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3135/ 159576 | consumed samples: 53072 | elapsed time per iteration (ms): 14774.4 | learning rate: 1.470E-05 | global batch size: 32 | lm loss: 6.708944E+00 | loss scale: 16384.0 | grad norm: 55756.795 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3136/ 159576 | consumed samples: 53104 | elapsed time per iteration (ms): 14618.5 | learning rate: 1.471E-05 | global batch size: 32 | lm loss: 6.679635E+00 | loss scale: 16384.0 | grad norm: 42400.449 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3137/ 159576 | consumed samples: 53136 | elapsed time per iteration (ms): 14614.4 | learning rate: 1.472E-05 | global batch size: 32 | lm loss: 6.469272E+00 | loss scale: 16384.0 | grad norm: 142351.991 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3138/ 159576 | consumed samples: 53168 | elapsed time per iteration (ms): 14596.5 | learning rate: 1.473E-05 | global batch size: 32 | lm loss: 6.554899E+00 | loss scale: 16384.0 | grad norm: 98568.745 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3139/ 159576 | consumed samples: 53200 | elapsed time per iteration (ms): 14719.6 | learning rate: 1.474E-05 | global batch size: 32 | lm loss: 6.618309E+00 | loss scale: 16384.0 | grad norm: 73504.428 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3140/ 159576 | consumed samples: 53232 | elapsed time per iteration (ms): 14627.2 | learning rate: 1.475E-05 | global batch size: 32 | lm loss: 6.588873E+00 | loss scale: 16384.0 | grad norm: 73534.305 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3141/ 159576 | consumed samples: 53264 | elapsed time per iteration (ms): 14634.4 | learning rate: 1.476E-05 | global batch size: 32 | lm loss: 6.357007E+00 | loss scale: 16384.0 | grad norm: 84712.428 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3142/ 159576 | consumed samples: 53296 | elapsed time per iteration (ms): 14717.8 | learning rate: 1.476E-05 | global batch size: 32 | lm loss: 6.623076E+00 | loss scale: 16384.0 | grad norm: 94140.049 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3143/ 159576 | consumed samples: 53328 | elapsed time per iteration (ms): 14697.5 | learning rate: 1.477E-05 | global batch size: 32 | lm loss: 6.562120E+00 | loss scale: 16384.0 | grad norm: 60657.367 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3144/ 159576 | consumed samples: 53360 | elapsed time per iteration (ms): 14578.1 | learning rate: 1.478E-05 | global batch size: 32 | lm loss: 6.445246E+00 | loss scale: 16384.0 | grad norm: 61798.286 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3145/ 159576 | consumed samples: 53392 | elapsed time per iteration (ms): 14616.8 | learning rate: 1.479E-05 | global batch size: 32 | lm loss: 6.440137E+00 | loss scale: 16384.0 | grad norm: 72537.427 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3146/ 159576 | consumed samples: 53424 | elapsed time per iteration (ms): 14619.6 | learning rate: 1.480E-05 | global batch size: 32 | lm loss: 6.739626E+00 | loss scale: 16384.0 | grad norm: 53372.380 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3147/ 159576 | consumed samples: 53456 | elapsed time per iteration (ms): 14895.9 | learning rate: 1.481E-05 | global batch size: 32 | lm loss: 6.588343E+00 | loss scale: 16384.0 | grad norm: 132102.636 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3148/ 159576 | consumed samples: 53488 | elapsed time per iteration (ms): 14681.1 | learning rate: 1.482E-05 | global batch size: 32 | lm loss: 6.551591E+00 | loss scale: 16384.0 | grad norm: 58550.161 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3149/ 159576 | consumed samples: 53520 | elapsed time per iteration (ms): 14682.3 | learning rate: 1.483E-05 | global batch size: 32 | lm loss: 6.632958E+00 | loss scale: 16384.0 | grad norm: 77007.903 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3150/ 159576 | consumed samples: 53552 | elapsed time per iteration (ms): 14624.1 | learning rate: 1.484E-05 | global batch size: 32 | lm loss: 6.648820E+00 | loss scale: 16384.0 | grad norm: 86896.917 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3151/ 159576 | consumed samples: 53584 | elapsed time per iteration (ms): 14845.8 | learning rate: 1.484E-05 | global batch size: 32 | lm loss: 6.446036E+00 | loss scale: 16384.0 | grad norm: 89979.541 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3152/ 159576 | consumed samples: 53616 | elapsed time per iteration (ms): 14727.8 | learning rate: 1.485E-05 | global batch size: 32 | lm loss: 6.617037E+00 | loss scale: 16384.0 | grad norm: 58488.767 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3153/ 159576 | consumed samples: 53648 | elapsed time per iteration (ms): 14649.7 | learning rate: 1.486E-05 | global batch size: 32 | lm loss: 6.529748E+00 | loss scale: 16384.0 | grad norm: 74833.007 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3154/ 159576 | consumed samples: 53680 | elapsed time per iteration (ms): 14647.6 | learning rate: 1.487E-05 | global batch size: 32 | lm loss: 6.562946E+00 | loss scale: 16384.0 | grad norm: 52935.305 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3155/ 159576 | consumed samples: 53712 | elapsed time per iteration (ms): 15107.7 | learning rate: 1.488E-05 | global batch size: 32 | lm loss: 6.514643E+00 | loss scale: 16384.0 | grad norm: 115570.754 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3156/ 159576 | consumed samples: 53744 | elapsed time per iteration (ms): 14720.1 | learning rate: 1.489E-05 | global batch size: 32 | lm loss: 6.684644E+00 | loss scale: 16384.0 | grad norm: 80957.169 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3157/ 159576 | consumed samples: 53776 | elapsed time per iteration (ms): 14692.8 | learning rate: 1.490E-05 | global batch size: 32 | lm loss: 6.519046E+00 | loss scale: 16384.0 | grad norm: 55678.918 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3158/ 159576 | consumed samples: 53808 | elapsed time per iteration (ms): 14686.5 | learning rate: 1.491E-05 | global batch size: 32 | lm loss: 6.746099E+00 | loss scale: 16384.0 | grad norm: 90492.004 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3159/ 159576 | consumed samples: 53840 | elapsed time per iteration (ms): 15011.1 | learning rate: 1.492E-05 | global batch size: 32 | lm loss: 6.536778E+00 | loss scale: 16384.0 | grad norm: 71520.569 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3160/ 159576 | consumed samples: 53872 | elapsed time per iteration (ms): 14579.4 | learning rate: 1.492E-05 | global batch size: 32 | lm loss: 6.666056E+00 | loss scale: 16384.0 | grad norm: 84616.230 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3161/ 159576 | consumed samples: 53904 | elapsed time per iteration (ms): 14644.1 | learning rate: 1.493E-05 | global batch size: 32 | lm loss: 6.597644E+00 | loss scale: 16384.0 | grad norm: 75093.664 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3162/ 159576 | consumed samples: 53936 | elapsed time per iteration (ms): 14697.1 | learning rate: 1.494E-05 | global batch size: 32 | lm loss: 6.446161E+00 | loss scale: 16384.0 | grad norm: 65649.952 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3163/ 159576 | consumed samples: 53968 | elapsed time per iteration (ms): 14947.2 | learning rate: 1.495E-05 | global batch size: 32 | lm loss: 6.681765E+00 | loss scale: 16384.0 | grad norm: 60219.876 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3164/ 159576 | consumed samples: 54000 | elapsed time per iteration (ms): 14663.4 | learning rate: 1.496E-05 | global batch size: 32 | lm loss: 6.525707E+00 | loss scale: 16384.0 | grad norm: 68154.761 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3165/ 159576 | consumed samples: 54032 | elapsed time per iteration (ms): 14769.3 | learning rate: 1.497E-05 | global batch size: 32 | lm loss: 6.587021E+00 | loss scale: 16384.0 | grad norm: 78180.401 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3166/ 159576 | consumed samples: 54064 | elapsed time per iteration (ms): 14610.2 | learning rate: 1.498E-05 | global batch size: 32 | lm loss: 6.519161E+00 | loss scale: 16384.0 | grad norm: 61912.261 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3167/ 159576 | consumed samples: 54096 | elapsed time per iteration (ms): 14999.0 | learning rate: 1.499E-05 | global batch size: 32 | lm loss: 6.632318E+00 | loss scale: 16384.0 | grad norm: 108253.525 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3168/ 159576 | consumed samples: 54128 | elapsed time per iteration (ms): 14650.1 | learning rate: 1.500E-05 | global batch size: 32 | lm loss: 6.465475E+00 | loss scale: 16384.0 | grad norm: 62950.421 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3169/ 159576 | consumed samples: 54160 | elapsed time per iteration (ms): 14661.3 | learning rate: 1.500E-05 | global batch size: 32 | lm loss: 6.539711E+00 | loss scale: 16384.0 | grad norm: 92615.638 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3170/ 159576 | consumed samples: 54192 | elapsed time per iteration (ms): 14674.1 | learning rate: 1.501E-05 | global batch size: 32 | lm loss: 6.579189E+00 | loss scale: 16384.0 | grad norm: 83785.863 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3171/ 159576 | consumed samples: 54224 | elapsed time per iteration (ms): 15070.8 | learning rate: 1.502E-05 | global batch size: 32 | lm loss: 6.793476E+00 | loss scale: 16384.0 | grad norm: 62540.200 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3172/ 159576 | consumed samples: 54256 | elapsed time per iteration (ms): 14666.7 | learning rate: 1.503E-05 | global batch size: 32 | lm loss: 6.584558E+00 | loss scale: 16384.0 | grad norm: 112108.286 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3173/ 159576 | consumed samples: 54288 | elapsed time per iteration (ms): 14625.8 | learning rate: 1.504E-05 | global batch size: 32 | lm loss: 6.600308E+00 | loss scale: 16384.0 | grad norm: 74654.549 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3174/ 159576 | consumed samples: 54320 | elapsed time per iteration (ms): 14636.6 | learning rate: 1.505E-05 | global batch size: 32 | lm loss: 6.586472E+00 | loss scale: 16384.0 | grad norm: 64570.380 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3175/ 159576 | consumed samples: 54352 | elapsed time per iteration (ms): 15097.6 | learning rate: 1.506E-05 | global batch size: 32 | lm loss: 6.611074E+00 | loss scale: 16384.0 | grad norm: 67988.200 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3176/ 159576 | consumed samples: 54384 | elapsed time per iteration (ms): 14507.7 | learning rate: 1.507E-05 | global batch size: 32 | lm loss: 6.524911E+00 | loss scale: 16384.0 | grad norm: 52695.097 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3177/ 159576 | consumed samples: 54416 | elapsed time per iteration (ms): 14667.9 | learning rate: 1.508E-05 | global batch size: 32 | lm loss: 6.622879E+00 | loss scale: 16384.0 | grad norm: 96311.880 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3178/ 159576 | consumed samples: 54448 | elapsed time per iteration (ms): 14717.9 | learning rate: 1.508E-05 | global batch size: 32 | lm loss: 6.557679E+00 | loss scale: 16384.0 | grad norm: 75112.233 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3179/ 159576 | consumed samples: 54480 | elapsed time per iteration (ms): 15028.6 | learning rate: 1.509E-05 | global batch size: 32 | lm loss: 6.508760E+00 | loss scale: 16384.0 | grad norm: 67929.222 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3180/ 159576 | consumed samples: 54512 | elapsed time per iteration (ms): 14774.6 | learning rate: 1.510E-05 | global batch size: 32 | lm loss: 6.573524E+00 | loss scale: 16384.0 | grad norm: 76526.339 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3181/ 159576 | consumed samples: 54544 | elapsed time per iteration (ms): 14648.5 | learning rate: 1.511E-05 | global batch size: 32 | lm loss: 6.629518E+00 | loss scale: 16384.0 | grad norm: 51441.208 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3182/ 159576 | consumed samples: 54576 | elapsed time per iteration (ms): 14620.2 | learning rate: 1.512E-05 | global batch size: 32 | lm loss: 6.528477E+00 | loss scale: 16384.0 | grad norm: 84031.596 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3183/ 159576 | consumed samples: 54608 | elapsed time per iteration (ms): 14671.0 | learning rate: 1.513E-05 | global batch size: 32 | lm loss: 6.450350E+00 | loss scale: 16384.0 | grad norm: 47787.227 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3184/ 159576 | consumed samples: 54640 | elapsed time per iteration (ms): 14835.3 | learning rate: 1.514E-05 | global batch size: 32 | lm loss: 6.547495E+00 | loss scale: 16384.0 | grad norm: 57635.062 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3185/ 159576 | consumed samples: 54672 | elapsed time per iteration (ms): 14691.4 | learning rate: 1.515E-05 | global batch size: 32 | lm loss: 6.438165E+00 | loss scale: 16384.0 | grad norm: 59205.412 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3186/ 159576 | consumed samples: 54704 | elapsed time per iteration (ms): 14599.9 | learning rate: 1.516E-05 | global batch size: 32 | lm loss: 6.543282E+00 | loss scale: 16384.0 | grad norm: 56916.506 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3187/ 159576 | consumed samples: 54736 | elapsed time per iteration (ms): 14594.3 | learning rate: 1.516E-05 | global batch size: 32 | lm loss: 6.619707E+00 | loss scale: 16384.0 | grad norm: 87429.373 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3188/ 159576 | consumed samples: 54768 | elapsed time per iteration (ms): 14717.0 | learning rate: 1.517E-05 | global batch size: 32 | lm loss: 6.575029E+00 | loss scale: 16384.0 | grad norm: 63063.172 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3189/ 159576 | consumed samples: 54800 | elapsed time per iteration (ms): 14535.7 | learning rate: 1.518E-05 | global batch size: 32 | lm loss: 6.572168E+00 | loss scale: 16384.0 | grad norm: 85759.591 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3190/ 159576 | consumed samples: 54832 | elapsed time per iteration (ms): 14535.8 | learning rate: 1.519E-05 | global batch size: 32 | lm loss: 6.540303E+00 | loss scale: 16384.0 | grad norm: 59464.367 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3191/ 159576 | consumed samples: 54864 | elapsed time per iteration (ms): 14477.2 | learning rate: 1.520E-05 | global batch size: 32 | lm loss: 6.545095E+00 | loss scale: 16384.0 | grad norm: 53870.876 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3192/ 159576 | consumed samples: 54896 | elapsed time per iteration (ms): 14651.8 | learning rate: 1.521E-05 | global batch size: 32 | lm loss: 6.497169E+00 | loss scale: 16384.0 | grad norm: 50516.018 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
number of nan iterations: 0 | -time (ms) - iteration 3194/ 159576 | consumed samples: 54960 | elapsed time per iteration (ms): 14548.6 | learning rate: 1.523E-05 | global batch size: 32 | lm loss: 6.704625E+00 | loss scale: 16384.0 | grad norm: 64544.434 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3195/ 159576 | consumed samples: 54992 | elapsed time per iteration (ms): 14549.1 | learning rate: 1.524E-05 | global batch size: 32 | lm loss: 6.489696E+00 | loss scale: 16384.0 | grad norm: 43746.021 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3196/ 159576 | consumed samples: 55024 | elapsed time per iteration (ms): 14783.1 | learning rate: 1.524E-05 | global batch size: 32 | lm loss: 6.496898E+00 | loss scale: 16384.0 | grad norm: 146573.122 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3197/ 159576 | consumed samples: 55056 | elapsed time per iteration (ms): 14527.9 | learning rate: 1.525E-05 | global batch size: 32 | lm loss: 6.568567E+00 | loss scale: 16384.0 | grad norm: 78804.650 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3198/ 159576 | consumed samples: 55088 | elapsed time per iteration (ms): 14523.2 | learning rate: 1.526E-05 | global batch size: 32 | lm loss: 6.598960E+00 | loss scale: 16384.0 | grad norm: 96783.060 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3199/ 159576 | consumed samples: 55120 | elapsed time per iteration (ms): 14540.7 | learning rate: 1.527E-05 | global batch size: 32 | lm loss: 6.572606E+00 | loss scale: 16384.0 | grad norm: 89417.690 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3200/ 159576 | consumed samples: 55152 | elapsed time per iteration (ms): 15008.9 | learning rate: 1.528E-05 | global batch size: 32 | lm loss: 6.506562E+00 | loss scale: 16384.0 | grad norm: 41993.325 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3201/ 159576 | consumed samples: 55184 | elapsed time per iteration (ms): 14658.0 | learning rate: 1.529E-05 | global batch size: 32 | lm loss: 6.782739E+00 | loss scale: 16384.0 | grad norm: 352113.189 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3202/ 159576 | consumed samples: 55216 | elapsed time per iteration (ms): 14567.2 | learning rate: 1.530E-05 | global batch size: 32 | lm loss: 6.567737E+00 | loss scale: 16384.0 | grad norm: 255563.854 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3203/ 159576 | consumed samples: 55248 | elapsed time per iteration (ms): 14521.2 | learning rate: 1.531E-05 | global batch size: 32 | lm loss: 6.758952E+00 | loss scale: 16384.0 | grad norm: 132639.437 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3204/ 159576 | consumed samples: 55280 | elapsed time per iteration (ms): 15057.0 | learning rate: 1.532E-05 | global batch size: 32 | lm loss: 6.644050E+00 | loss scale: 16384.0 | grad norm: 95206.523 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3205/ 159576 | consumed samples: 55312 | elapsed time per iteration (ms): 14632.3 | learning rate: 1.532E-05 
| global batch size: 32 | lm loss: 6.559070E+00 | loss scale: 16384.0 | grad norm: 92448.552 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3206/ 159576 | consumed samples: 55344 | elapsed time per iteration (ms): 14560.7 | learning rate: 1.533E-05 | global batch size: 32 | lm loss: 6.544364E+00 | loss scale: 16384.0 | grad norm: 87185.641 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3207/ 159576 | consumed samples: 55376 | elapsed time per iteration (ms): 14559.6 | learning rate: 1.534E-05 | global batch size: 32 | lm loss: 6.617725E+00 | loss scale: 16384.0 | grad norm: 147534.405 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3208/ 159576 | consumed samples: 55408 | elapsed time per iteration (ms): 14919.1 | learning rate: 1.535E-05 | global batch size: 32 | lm loss: 6.505226E+00 | loss scale: 16384.0 | grad norm: 82317.664 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3209/ 159576 | consumed samples: 55440 | elapsed time per iteration (ms): 14628.9 | learning rate: 1.536E-05 | global batch size: 32 | lm loss: 6.529959E+00 | loss scale: 16384.0 | grad norm: 62063.357 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3210/ 159576 | consumed samples: 55472 | elapsed time per iteration (ms): 14562.8 | learning rate: 1.537E-05 | global batch size: 32 | lm loss: 6.499523E+00 | loss scale: 16384.0 | grad norm: 59027.974 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3211/ 159576 | consumed samples: 55504 | elapsed time per iteration (ms): 14551.3 | learning rate: 1.538E-05 | global batch size: 32 | lm loss: 6.612097E+00 | loss scale: 16384.0 | grad norm: 142076.623 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3212/ 159576 | consumed samples: 55536 | elapsed time per iteration (ms): 14906.9 | learning rate: 1.539E-05 | global batch size: 32 | lm loss: 6.726549E+00 | loss scale: 16384.0 | grad norm: 85971.039 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3213/ 159576 | consumed samples: 55568 | elapsed time per iteration (ms): 14484.4 | learning rate: 1.539E-05 | global batch size: 32 | lm loss: 6.627134E+00 | loss scale: 16384.0 | grad norm: 74784.069 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3214/ 159576 | consumed samples: 55600 | elapsed time per iteration (ms): 14568.5 | learning rate: 1.540E-05 | global batch size: 32 | lm loss: 6.684568E+00 | loss scale: 16384.0 | grad norm: 85537.156 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3215/ 159576 | consumed samples: 55632 | elapsed time per iteration (ms): 14541.7 | learning rate: 1.541E-05 | global batch size: 32 | lm loss: 6.632449E+00 | loss scale: 16384.0 | grad norm: 118554.262 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3216/ 159576 | consumed samples: 55664 | elapsed time per iteration (ms): 14903.9 | learning rate: 1.542E-05 | global batch size: 32 | lm loss: 6.491426E+00 | loss scale: 16384.0 | grad norm: 66361.502 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan 
iterations: 0 | -time (ms) - iteration 3217/ 159576 | consumed samples: 55696 | elapsed time per iteration (ms): 14654.1 | learning rate: 1.543E-05 | global batch size: 32 | lm loss: 6.599683E+00 | loss scale: 16384.0 | grad norm: 66284.456 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3218/ 159576 | consumed samples: 55728 | elapsed time per iteration (ms): 14564.4 | learning rate: 1.544E-05 | global batch size: 32 | lm loss: 6.671634E+00 | loss scale: 16384.0 | grad norm: 48626.750 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3219/ 159576 | consumed samples: 55760 | elapsed time per iteration (ms): 14567.8 | learning rate: 1.545E-05 | global batch size: 32 | lm loss: 6.653804E+00 | loss scale: 16384.0 | grad norm: 84407.596 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3220/ 159576 | consumed samples: 55792 | elapsed time per iteration (ms): 14939.3 | learning rate: 1.546E-05 | global batch size: 32 | lm loss: 6.519379E+00 | loss scale: 16384.0 | grad norm: 72885.533 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3221/ 159576 | consumed samples: 55824 | elapsed time per iteration (ms): 14579.8 | learning rate: 1.547E-05 | global batch size: 32 | lm loss: 6.658468E+00 | loss scale: 16384.0 | grad norm: 69063.419 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3222/ 159576 | consumed samples: 55856 | elapsed time per iteration (ms): 14568.3 | learning rate: 1.547E-05 | global batch size: 32 | lm loss: 6.544227E+00 | loss scale: 16384.0 | grad norm: 94167.013 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3223/ 159576 | consumed samples: 55888 | elapsed time per iteration (ms): 14530.3 | learning rate: 1.548E-05 | global batch size: 32 | lm loss: 6.519998E+00 | loss scale: 16384.0 | grad norm: 74630.691 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3224/ 159576 | consumed samples: 55920 | elapsed time per iteration (ms): 14849.7 | learning rate: 1.549E-05 | global batch size: 32 | lm loss: 6.586551E+00 | loss scale: 16384.0 | grad norm: 76630.181 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3225/ 159576 | consumed samples: 55952 | elapsed time per iteration (ms): 14888.8 | learning rate: 1.550E-05 | global batch size: 32 | lm loss: 6.687891E+00 | loss scale: 16384.0 | grad norm: 70630.932 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3226/ 159576 | consumed samples: 55984 | elapsed time per iteration (ms): 14540.3 | learning rate: 1.551E-05 | global batch size: 32 | lm loss: 6.595382E+00 | loss scale: 16384.0 | grad norm: 92178.351 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3227/ 159576 | consumed samples: 56016 | elapsed time per iteration (ms): 14557.7 | learning rate: 1.552E-05 | global batch size: 32 | lm loss: 6.364616E+00 | loss scale: 16384.0 | grad norm: 62395.737 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3228/ 159576 | consumed samples: 56048 | elapsed time per iteration (ms): 14547.2 | learning rate: 1.553E-05 | global batch 
size: 32 | lm loss: 6.614971E+00 | loss scale: 16384.0 | grad norm: 72348.132 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3229/ 159576 | consumed samples: 56080 | elapsed time per iteration (ms): 14765.8 | learning rate: 1.554E-05 | global batch size: 32 | lm loss: 6.527470E+00 | loss scale: 16384.0 | grad norm: 70068.847 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3230/ 159576 | consumed samples: 56112 | elapsed time per iteration (ms): 14547.7 | learning rate: 1.555E-05 | global batch size: 32 | lm loss: 6.691795E+00 | loss scale: 16384.0 | grad norm: 79540.792 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3231/ 159576 | consumed samples: 56144 | elapsed time per iteration (ms): 14659.9 | learning rate: 1.555E-05 | global batch size: 32 | lm loss: 6.541613E+00 | loss scale: 16384.0 | grad norm: 49841.975 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3232/ 159576 | consumed samples: 56176 | elapsed time per iteration (ms): 14501.9 | learning rate: 1.556E-05 | global batch size: 32 | lm loss: 6.634310E+00 | loss scale: 16384.0 | grad norm: 67541.885 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3233/ 159576 | consumed samples: 56208 | elapsed time per iteration (ms): 14751.5 | learning rate: 1.557E-05 | global batch size: 32 | lm loss: 6.538262E+00 | loss scale: 16384.0 | grad norm: 60234.071 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3234/ 159576 | consumed samples: 56240 | elapsed time per iteration (ms): 14540.9 | learning rate: 1.558E-05 | global batch size: 32 | lm loss: 6.572741E+00 | loss scale: 16384.0 | grad norm: 51996.631 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3235/ 159576 | consumed samples: 56272 | elapsed time per iteration (ms): 14525.6 | learning rate: 1.559E-05 | global batch size: 32 | lm loss: 6.514688E+00 | loss scale: 16384.0 | grad norm: 80129.382 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3236/ 159576 | consumed samples: 56304 | elapsed time per iteration (ms): 14525.2 | learning rate: 1.560E-05 | global batch size: 32 | lm loss: 6.597489E+00 | loss scale: 16384.0 | grad norm: 106848.471 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3237/ 159576 | consumed samples: 56336 | elapsed time per iteration (ms): 14776.9 | learning rate: 1.561E-05 | global batch size: 32 | lm loss: 6.556981E+00 | loss scale: 16384.0 | grad norm: 71439.752 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3238/ 159576 | consumed samples: 56368 | elapsed time per iteration (ms): 14561.5 | learning rate: 1.562E-05 | global batch size: 32 | lm loss: 6.569613E+00 | loss scale: 16384.0 | grad norm: 70525.236 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3239/ 159576 | consumed samples: 56400 | elapsed time per iteration (ms): 14478.4 | learning rate: 1.563E-05 | global batch size: 32 | lm loss: 6.541091E+00 | loss scale: 16384.0 | grad norm: 47017.630 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | 
-time (ms) - iteration 3240/ 159576 | consumed samples: 56432 | elapsed time per iteration (ms): 14587.1 | learning rate: 1.563E-05 | global batch size: 32 | lm loss: 6.697134E+00 | loss scale: 16384.0 | grad norm: 53866.228 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3241/ 159576 | consumed samples: 56464 | elapsed time per iteration (ms): 14901.2 | learning rate: 1.564E-05 | global batch size: 32 | lm loss: 6.463998E+00 | loss scale: 16384.0 | grad norm: 72517.961 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3242/ 159576 | consumed samples: 56496 | elapsed time per iteration (ms): 14602.2 | learning rate: 1.565E-05 | global batch size: 32 | lm loss: 6.557918E+00 | loss scale: 16384.0 | grad norm: 51986.356 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3243/ 159576 | consumed samples: 56528 | elapsed time per iteration (ms): 14553.6 | learning rate: 1.566E-05 | global batch size: 32 | lm loss: 6.491773E+00 | loss scale: 16384.0 | grad norm: 68222.486 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3244/ 159576 | consumed samples: 56560 | elapsed time per iteration (ms): 14559.7 | learning rate: 1.567E-05 | global batch size: 32 | lm loss: 6.590208E+00 | loss scale: 16384.0 | grad norm: 72691.281 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3245/ 159576 | consumed samples: 56592 | elapsed time per iteration (ms): 14894.6 | learning rate: 1.568E-05 | global batch size: 32 | lm loss: 6.551069E+00 | loss scale: 16384.0 | grad norm: 71227.557 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3246/ 159576 | consumed samples: 56624 | elapsed time per iteration (ms): 14706.4 | learning rate: 1.569E-05 | global batch size: 32 | lm loss: 6.536276E+00 | loss scale: 16384.0 | grad norm: 77853.983 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3247/ 159576 | consumed samples: 56656 | elapsed time per iteration (ms): 14557.1 | learning rate: 1.570E-05 | global batch size: 32 | lm loss: 6.547366E+00 | loss scale: 16384.0 | grad norm: 91853.496 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3248/ 159576 | consumed samples: 56688 | elapsed time per iteration (ms): 14512.9 | learning rate: 1.571E-05 | global batch size: 32 | lm loss: 6.604490E+00 | loss scale: 16384.0 | grad norm: 61725.711 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3249/ 159576 | consumed samples: 56720 | elapsed time per iteration (ms): 14949.1 | learning rate: 1.571E-05 | global batch size: 32 | lm loss: 6.555557E+00 | loss scale: 16384.0 | grad norm: 55414.359 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3250/ 159576 | consumed samples: 56752 | elapsed time per iteration (ms): 14468.6 | learning rate: 1.572E-05 | global batch size: 32 | lm loss: 6.471034E+00 | loss scale: 16384.0 | grad norm: 39264.272 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3251/ 159576 | consumed samples: 56784 | elapsed time per iteration (ms): 14601.9 | learning rate: 1.573E-05 | global batch size: 32 | lm loss: 
6.472137E+00 | loss scale: 16384.0 | grad norm: 51720.854 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3252/ 159576 | consumed samples: 56816 | elapsed time per iteration (ms): 14481.3 | learning rate: 1.574E-05 | global batch size: 32 | lm loss: 6.564797E+00 | loss scale: 16384.0 | grad norm: 55129.631 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3253/ 159576 | consumed samples: 56848 | elapsed time per iteration (ms): 14865.7 | learning rate: 1.575E-05 | global batch size: 32 | lm loss: 6.433147E+00 | loss scale: 16384.0 | grad norm: 48761.095 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3254/ 159576 | consumed samples: 56880 | elapsed time per iteration (ms): 14607.7 | learning rate: 1.576E-05 | global batch size: 32 | lm loss: 6.486347E+00 | loss scale: 16384.0 | grad norm: 51447.567 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3255/ 159576 | consumed samples: 56912 | elapsed time per iteration (ms): 14476.2 | learning rate: 1.577E-05 | global batch size: 32 | lm loss: 6.670080E+00 | loss scale: 16384.0 | grad norm: 49692.027 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3256/ 159576 | consumed samples: 56944 | elapsed time per iteration (ms): 14532.2 | learning rate: 1.578E-05 | global batch size: 32 | lm loss: 6.449496E+00 | loss scale: 16384.0 | grad norm: 46597.035 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3257/ 159576 | consumed samples: 56976 | elapsed time per iteration (ms): 14907.4 | learning rate: 1.579E-05 | global batch size: 32 | lm loss: 6.651023E+00 | loss scale: 16384.0 | grad norm: 50509.142 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3258/ 159576 | consumed samples: 57008 | elapsed time per iteration (ms): 14521.0 | learning rate: 1.579E-05 | global batch size: 32 | lm loss: 6.557060E+00 | loss scale: 16384.0 | grad norm: 46431.742 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3259/ 159576 | consumed samples: 57040 | elapsed time per iteration (ms): 14527.8 | learning rate: 1.580E-05 | global batch size: 32 | lm loss: 6.802115E+00 | loss scale: 16384.0 | grad norm: 46019.500 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3260/ 159576 | consumed samples: 57072 | elapsed time per iteration (ms): 14560.3 | learning rate: 1.581E-05 | global batch size: 32 | lm loss: 6.480462E+00 | loss scale: 16384.0 | grad norm: 54023.847 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3261/ 159576 | consumed samples: 57104 | elapsed time per iteration (ms): 14898.0 | learning rate: 1.582E-05 | global batch size: 32 | lm loss: 6.696016E+00 | loss scale: 16384.0 | grad norm: 51541.626 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3262/ 159576 | consumed samples: 57136 | elapsed time per iteration (ms): 14574.6 | learning rate: 1.583E-05 | global batch size: 32 | lm loss: 6.633371E+00 | loss scale: 16384.0 | grad norm: 64314.799 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 
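Every record in this stretch of the log follows one fixed layout, so the file can be mined mechanically. Below is a minimal Python sketch (not part of the training repo; the regex simply mirrors the field names printed above) that pulls the numeric fields out of a blob of this log:

    import re

    # One iteration record, matching the layout printed in this log.
    RECORD_RE = re.compile(
        r"iteration\s+(?P<it>\d+)/\s*(?P<total>\d+)\s*\|"
        r"\s*consumed samples:\s*(?P<samples>\d+)\s*\|"
        r"\s*elapsed time per iteration \(ms\):\s*(?P<ms>[\d.]+)\s*\|"
        r"\s*learning rate:\s*(?P<lr>[\d.Ee+-]+)\s*\|"
        r"\s*global batch size:\s*(?P<gbs>\d+)\s*\|"
        r"\s*lm loss:\s*(?P<loss>[\d.Ee+-]+)\s*\|"
        r"\s*loss scale:\s*(?P<scale>[\d.]+)\s*\|"
        r"\s*grad norm:\s*(?P<gnorm>[\d.]+)"
    )

    def parse_records(text):
        """Yield one dict of numeric fields per iteration record."""
        for m in RECORD_RE.finditer(text):
            yield {
                "iteration": int(m["it"]),
                "consumed_samples": int(m["samples"]),
                "ms_per_iter": float(m["ms"]),
                "lr": float(m["lr"]),
                "global_batch_size": int(m["gbs"]),
                "lm_loss": float(m["loss"]),
                "loss_scale": float(m["scale"]),
                "grad_norm": float(m["gnorm"]),
            }

    sample = ("iteration 3171/ 159576 | consumed samples: 54224 | "
              "elapsed time per iteration (ms): 15070.8 | learning rate: 1.502E-05 | "
              "global batch size: 32 | lm loss: 6.793476E+00 | loss scale: 16384.0 | "
              "grad norm: 62540.200")
    print(next(parse_records(sample)))

At 32 samples per ~14.6 s step, these records correspond to a throughput of roughly 2.2 samples/s.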
- [iterations 3263-3331 / 159576, condensed: consumed samples 57168 -> 59344 | ~14.4-15.1 s/iteration | learning rate 1.584E-05 -> 1.644E-05 | global batch size: 32 | lm loss between 6.357250E+00 and 6.865191E+00 | loss scale 16384.0 through iteration 3263, then 32768.0 from iteration 3264 onward | grad norm 5.3E+04-3.4E+05 | num zeros: 0.0 | no skipped and no nan iterations]
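One state change stands out above: the loss scale doubles from 16384.0 to 32768.0 at iteration 3264 while the skipped/nan counters stay at zero. That is the usual dynamic loss-scaling behaviour in mixed-precision training: grow the scale after a window of overflow-free steps, shrink it and skip the step on overflow. A toy sketch of that rule follows; the window length and growth/backoff factors are illustrative assumptions, not values read from this log.

    class DynamicLossScaler:
        """Toy dynamic loss scaler: grow after a clean window, back off on overflow.

        window/growth/backoff are assumed defaults for illustration,
        not settings recovered from this training run.
        """

        def __init__(self, scale=16384.0, window=1000, growth=2.0, backoff=0.5):
            self.scale = scale
            self.window = window
            self.growth = growth
            self.backoff = backoff
            self.clean_steps = 0

        def update(self, overflow: bool) -> None:
            if overflow:
                # The step would be skipped and the scale reduced.
                self.scale *= self.backoff
                self.clean_steps = 0
            else:
                self.clean_steps += 1
                if self.clean_steps == self.window:
                    # e.g. 16384.0 -> 32768.0, as seen at iteration 3264 above.
                    self.scale *= self.growth
                    self.clean_steps = 0

    scaler = DynamicLossScaler()
    for _ in range(1000):
        scaler.update(overflow=False)
    print(scaler.scale)  # 32768.0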
- [iteration 3332 / 159576: consumed samples: 59376 | elapsed time per iteration (ms): 14455.1 | learning rate: 1.645E-05 | global batch size: 32 | lm loss: 6.589927E+00 | loss scale: 32768.0 | grad norm: 240696.234 | num zeros: 0.0 | no skipped and no nan iterations]
-[2021-09-24 15:06:48] PULSE: tr8-104B is waiting for the previous job to finish before scheduling a new one using the dependency mechanism (1165978_[1-10%1] on 'gpu_p13' partition)
-[2021-09-24 15:06:48] PULSE: tr8-104B is running for 9:14:37 since 2021-09-24T05:52:11 (1162855_1 on 'gpu_p13' partition (r6i4n[5,7],r6i5n[2,7-8],r6i6n[0,2,6],r7i2n[4-5],r7i6n[2-4],r7i7n[7-8],r8i0n[2-3,5-8],r8i1n[0,2-4],r8i2n8,r8i3n[0-2],r8i5n[3-4],r8i7n[3-8],r9i0n[0-2],r9i1n[0-3],r9i2n[3-5,8],r9i3n[0-1,7-8],r9i4n[0-2],r9i5n[3-8],r9i6n[0,7-8])
- [iterations 3333-3341 / 159576, condensed: consumed samples 59408 -> 59664 | ~14.5-15.0 s/iteration | learning rate 1.646E-05 -> 1.653E-05 | global batch size: 32 | lm loss between 6.373518E+00 and 6.658302E+00 | loss scale: 32768.0 | grad norm 1.1E+05-2.5E+05 | num zeros: 0.0 | no skipped and no nan iterations]
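The consumed-samples column is simply a running sum of the global batch size, which makes the records easy to cross-check: around the PULSE messages above, iteration 3332 shows 59376 samples and iteration 3333 shows 59408 = 59376 + 32. A quick consistency check in the same vein, using values copied from this log:

    # (iteration, consumed samples) pairs copied from the records above.
    records = [(3332, 59376), (3333, 59408), (3334, 59440), (3341, 59664)]

    for (i0, s0), (i1, s1) in zip(records, records[1:]):
        # Each step consumes exactly one global batch of 32 samples.
        assert s1 - s0 == 32 * (i1 - i0), (i0, i1)
    print("consumed-samples column is consistent with global batch size 32")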
- [iterations 3342-3398 / 159576, condensed: consumed samples 59696 -> 61488 | ~14.4-15.1 s/iteration | learning rate 1.654E-05 -> 1.704E-05 | global batch size: 32 | lm loss between 6.296595E+00 and 6.769459E+00 | loss scale: 32768.0 | grad norm 8.9E+04-2.8E+05 | num zeros: 0.0 | no skipped and no nan iterations]
learning rate: 1.705E-05 | global batch size: 32 | lm loss: 6.555518E+00 | loss scale: 32768.0 | grad norm: 122435.114 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3400/ 159576 | consumed samples: 61552 | elapsed time per iteration (ms): 14673.6 | learning rate: 1.705E-05 | global batch size: 32 | lm loss: 6.464739E+00 | loss scale: 32768.0 | grad norm: 119112.548 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3401/ 159576 | consumed samples: 61584 | elapsed time per iteration (ms): 14910.7 | learning rate: 1.706E-05 | global batch size: 32 | lm loss: 6.473111E+00 | loss scale: 32768.0 | grad norm: 113410.819 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3402/ 159576 | consumed samples: 61616 | elapsed time per iteration (ms): 14645.2 | learning rate: 1.707E-05 | global batch size: 32 | lm loss: 6.476302E+00 | loss scale: 32768.0 | grad norm: 113730.379 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3403/ 159576 | consumed samples: 61648 | elapsed time per iteration (ms): 14580.6 | learning rate: 1.708E-05 | global batch size: 32 | lm loss: 6.449226E+00 | loss scale: 32768.0 | grad norm: 82819.459 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3404/ 159576 | consumed samples: 61680 | elapsed time per iteration (ms): 14600.7 | learning rate: 1.709E-05 | global batch size: 32 | lm loss: 6.560233E+00 | loss scale: 32768.0 | grad norm: 134696.405 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3405/ 159576 | consumed samples: 61712 | elapsed time per iteration (ms): 14772.7 | learning rate: 1.710E-05 | global batch size: 32 | lm loss: 6.546908E+00 | loss scale: 32768.0 | grad norm: 101163.521 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3406/ 159576 | consumed samples: 61744 | elapsed time per iteration (ms): 14593.3 | learning rate: 1.711E-05 | global batch size: 32 | lm loss: 6.541033E+00 | loss scale: 32768.0 | grad norm: 109699.529 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3407/ 159576 | consumed samples: 61776 | elapsed time per iteration (ms): 14624.0 | learning rate: 1.712E-05 | global batch size: 32 | lm loss: 6.511957E+00 | loss scale: 32768.0 | grad norm: 91123.954 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3408/ 159576 | consumed samples: 61808 | elapsed time per iteration (ms): 14724.5 | learning rate: 1.713E-05 | global batch size: 32 | lm loss: 6.628172E+00 | loss scale: 32768.0 | grad norm: 121584.252 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3409/ 159576 | consumed samples: 61840 | elapsed time per iteration (ms): 15120.6 | learning rate: 1.713E-05 | global batch size: 32 | lm loss: 6.578444E+00 | loss scale: 32768.0 | grad norm: 116757.586 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3410/ 159576 | consumed samples: 61872 | elapsed time per iteration (ms): 14619.5 | learning rate: 1.714E-05 | global batch size: 32 | lm loss: 6.415488E+00 | loss scale: 32768.0 | grad norm: 105815.444 | num zeros: 0.0 | number of skipped 
iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3411/ 159576 | consumed samples: 61904 | elapsed time per iteration (ms): 14577.8 | learning rate: 1.715E-05 | global batch size: 32 | lm loss: 6.553544E+00 | loss scale: 32768.0 | grad norm: 104053.489 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3412/ 159576 | consumed samples: 61936 | elapsed time per iteration (ms): 14587.5 | learning rate: 1.716E-05 | global batch size: 32 | lm loss: 6.435183E+00 | loss scale: 32768.0 | grad norm: 101905.898 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3413/ 159576 | consumed samples: 61968 | elapsed time per iteration (ms): 14985.9 | learning rate: 1.717E-05 | global batch size: 32 | lm loss: 6.580218E+00 | loss scale: 32768.0 | grad norm: 142325.290 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3414/ 159576 | consumed samples: 62000 | elapsed time per iteration (ms): 14646.8 | learning rate: 1.718E-05 | global batch size: 32 | lm loss: 6.534802E+00 | loss scale: 32768.0 | grad norm: 109771.164 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3415/ 159576 | consumed samples: 62032 | elapsed time per iteration (ms): 14644.6 | learning rate: 1.719E-05 | global batch size: 32 | lm loss: 6.582119E+00 | loss scale: 32768.0 | grad norm: 192056.720 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3416/ 159576 | consumed samples: 62064 | elapsed time per iteration (ms): 14616.1 | learning rate: 1.720E-05 | global batch size: 32 | lm loss: 6.496407E+00 | loss scale: 32768.0 | grad norm: 118953.837 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3417/ 159576 | consumed samples: 62096 | elapsed time per iteration (ms): 15113.2 | learning rate: 1.721E-05 | global batch size: 32 | lm loss: 6.475505E+00 | loss scale: 32768.0 | grad norm: 173828.473 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3418/ 159576 | consumed samples: 62128 | elapsed time per iteration (ms): 14635.6 | learning rate: 1.721E-05 | global batch size: 32 | lm loss: 6.318462E+00 | loss scale: 32768.0 | grad norm: 147925.562 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3419/ 159576 | consumed samples: 62160 | elapsed time per iteration (ms): 14611.3 | learning rate: 1.722E-05 | global batch size: 32 | lm loss: 6.571759E+00 | loss scale: 32768.0 | grad norm: 112885.924 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3420/ 159576 | consumed samples: 62192 | elapsed time per iteration (ms): 14573.5 | learning rate: 1.723E-05 | global batch size: 32 | lm loss: 6.461047E+00 | loss scale: 32768.0 | grad norm: 135373.791 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3421/ 159576 | consumed samples: 62224 | elapsed time per iteration (ms): 14978.7 | learning rate: 1.724E-05 | global batch size: 32 | lm loss: 6.554849E+00 | loss scale: 32768.0 | grad norm: 162048.264 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3422/ 159576 | consumed samples: 62256 | elapsed time per iteration (ms): 14574.6 | 
learning rate: 1.725E-05 | global batch size: 32 | lm loss: 6.443440E+00 | loss scale: 32768.0 | grad norm: 103393.805 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3423/ 159576 | consumed samples: 62288 | elapsed time per iteration (ms): 14578.8 | learning rate: 1.726E-05 | global batch size: 32 | lm loss: 6.490220E+00 | loss scale: 32768.0 | grad norm: 217891.504 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3424/ 159576 | consumed samples: 62320 | elapsed time per iteration (ms): 14669.3 | learning rate: 1.727E-05 | global batch size: 32 | lm loss: 6.475744E+00 | loss scale: 32768.0 | grad norm: 132019.548 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3425/ 159576 | consumed samples: 62352 | elapsed time per iteration (ms): 15003.7 | learning rate: 1.728E-05 | global batch size: 32 | lm loss: 6.639316E+00 | loss scale: 32768.0 | grad norm: 118549.933 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3426/ 159576 | consumed samples: 62384 | elapsed time per iteration (ms): 14473.5 | learning rate: 1.729E-05 | global batch size: 32 | lm loss: 6.529860E+00 | loss scale: 32768.0 | grad norm: 110134.510 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3427/ 159576 | consumed samples: 62416 | elapsed time per iteration (ms): 14593.0 | learning rate: 1.729E-05 | global batch size: 32 | lm loss: 6.424025E+00 | loss scale: 32768.0 | grad norm: 96948.279 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3428/ 159576 | consumed samples: 62448 | elapsed time per iteration (ms): 14574.8 | learning rate: 1.730E-05 | global batch size: 32 | lm loss: 6.603945E+00 | loss scale: 32768.0 | grad norm: 108813.419 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3429/ 159576 | consumed samples: 62480 | elapsed time per iteration (ms): 14962.4 | learning rate: 1.731E-05 | global batch size: 32 | lm loss: 6.519920E+00 | loss scale: 32768.0 | grad norm: 120997.396 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3430/ 159576 | consumed samples: 62512 | elapsed time per iteration (ms): 14606.5 | learning rate: 1.732E-05 | global batch size: 32 | lm loss: 6.519583E+00 | loss scale: 32768.0 | grad norm: 102226.597 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3431/ 159576 | consumed samples: 62544 | elapsed time per iteration (ms): 14685.5 | learning rate: 1.733E-05 | global batch size: 32 | lm loss: 6.413152E+00 | loss scale: 32768.0 | grad norm: 146442.757 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3432/ 159576 | consumed samples: 62576 | elapsed time per iteration (ms): 14642.7 | learning rate: 1.734E-05 | global batch size: 32 | lm loss: 6.416885E+00 | loss scale: 32768.0 | grad norm: 106692.633 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3433/ 159576 | consumed samples: 62608 | elapsed time per iteration (ms): 14943.4 | learning rate: 1.735E-05 | global batch size: 32 | lm loss: 6.684166E+00 | loss scale: 32768.0 | grad norm: 122647.780 | num zeros: 0.0 | number of skipped 
iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3434/ 159576 | consumed samples: 62640 | elapsed time per iteration (ms): 14559.8 | learning rate: 1.736E-05 | global batch size: 32 | lm loss: 6.582661E+00 | loss scale: 32768.0 | grad norm: 143037.633 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3435/ 159576 | consumed samples: 62672 | elapsed time per iteration (ms): 14581.0 | learning rate: 1.737E-05 | global batch size: 32 | lm loss: 6.459047E+00 | loss scale: 32768.0 | grad norm: 139754.449 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3436/ 159576 | consumed samples: 62704 | elapsed time per iteration (ms): 14594.3 | learning rate: 1.737E-05 | global batch size: 32 | lm loss: 6.455495E+00 | loss scale: 32768.0 | grad norm: 199133.358 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3437/ 159576 | consumed samples: 62736 | elapsed time per iteration (ms): 14983.6 | learning rate: 1.738E-05 | global batch size: 32 | lm loss: 6.507184E+00 | loss scale: 32768.0 | grad norm: 193681.925 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3438/ 159576 | consumed samples: 62768 | elapsed time per iteration (ms): 14797.2 | learning rate: 1.739E-05 | global batch size: 32 | lm loss: 6.461359E+00 | loss scale: 32768.0 | grad norm: 132732.709 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3439/ 159576 | consumed samples: 62800 | elapsed time per iteration (ms): 14579.8 | learning rate: 1.740E-05 | global batch size: 32 | lm loss: 6.704415E+00 | loss scale: 32768.0 | grad norm: 113391.882 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3440/ 159576 | consumed samples: 62832 | elapsed time per iteration (ms): 14621.6 | learning rate: 1.741E-05 | global batch size: 32 | lm loss: 6.473897E+00 | loss scale: 32768.0 | grad norm: 120849.572 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3441/ 159576 | consumed samples: 62864 | elapsed time per iteration (ms): 14686.1 | learning rate: 1.742E-05 | global batch size: 32 | lm loss: 6.459955E+00 | loss scale: 32768.0 | grad norm: 128216.917 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3442/ 159576 | consumed samples: 62896 | elapsed time per iteration (ms): 14857.9 | learning rate: 1.743E-05 | global batch size: 32 | lm loss: 6.424060E+00 | loss scale: 32768.0 | grad norm: 102672.871 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3443/ 159576 | consumed samples: 62928 | elapsed time per iteration (ms): 14570.1 | learning rate: 1.744E-05 | global batch size: 32 | lm loss: 6.534360E+00 | loss scale: 32768.0 | grad norm: 184877.887 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3444/ 159576 | consumed samples: 62960 | elapsed time per iteration (ms): 14620.2 | learning rate: 1.745E-05 | global batch size: 32 | lm loss: 6.629717E+00 | loss scale: 32768.0 | grad norm: 138408.073 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3445/ 159576 | consumed samples: 62992 | elapsed time per iteration (ms): 14619.1 | 
learning rate: 1.745E-05 | global batch size: 32 | lm loss: 6.494986E+00 | loss scale: 32768.0 | grad norm: 131634.897 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3446/ 159576 | consumed samples: 63024 | elapsed time per iteration (ms): 14739.8 | learning rate: 1.746E-05 | global batch size: 32 | lm loss: 6.529834E+00 | loss scale: 32768.0 | grad norm: 190204.428 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3447/ 159576 | consumed samples: 63056 | elapsed time per iteration (ms): 14575.9 | learning rate: 1.747E-05 | global batch size: 32 | lm loss: 6.519164E+00 | loss scale: 32768.0 | grad norm: 190893.633 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3448/ 159576 | consumed samples: 63088 | elapsed time per iteration (ms): 14611.0 | learning rate: 1.748E-05 | global batch size: 32 | lm loss: 6.431557E+00 | loss scale: 32768.0 | grad norm: 127326.623 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3449/ 159576 | consumed samples: 63120 | elapsed time per iteration (ms): 14615.1 | learning rate: 1.749E-05 | global batch size: 32 | lm loss: 6.213955E+00 | loss scale: 32768.0 | grad norm: 149485.955 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3450/ 159576 | consumed samples: 63152 | elapsed time per iteration (ms): 14697.2 | learning rate: 1.750E-05 | global batch size: 32 | lm loss: 6.669972E+00 | loss scale: 32768.0 | grad norm: 121418.512 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3451/ 159576 | consumed samples: 63184 | elapsed time per iteration (ms): 14506.2 | learning rate: 1.751E-05 | global batch size: 32 | lm loss: 6.538607E+00 | loss scale: 32768.0 | grad norm: 160228.418 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3452/ 159576 | consumed samples: 63216 | elapsed time per iteration (ms): 14518.4 | learning rate: 1.752E-05 | global batch size: 32 | lm loss: 6.466623E+00 | loss scale: 32768.0 | grad norm: 132558.400 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3453/ 159576 | consumed samples: 63248 | elapsed time per iteration (ms): 14654.4 | learning rate: 1.753E-05 | global batch size: 32 | lm loss: 6.575057E+00 | loss scale: 32768.0 | grad norm: 126715.953 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3454/ 159576 | consumed samples: 63280 | elapsed time per iteration (ms): 14975.6 | learning rate: 1.753E-05 | global batch size: 32 | lm loss: 6.469002E+00 | loss scale: 32768.0 | grad norm: 134315.470 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3455/ 159576 | consumed samples: 63312 | elapsed time per iteration (ms): 14595.3 | learning rate: 1.754E-05 | global batch size: 32 | lm loss: 6.471159E+00 | loss scale: 32768.0 | grad norm: 132183.538 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3456/ 159576 | consumed samples: 63344 | elapsed time per iteration (ms): 14624.6 | learning rate: 1.755E-05 | global batch size: 32 | lm loss: 6.390759E+00 | loss scale: 32768.0 | grad norm: 168993.753 | num zeros: 0.0 | number of skipped 
iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3457/ 159576 | consumed samples: 63376 | elapsed time per iteration (ms): 14611.9 | learning rate: 1.756E-05 | global batch size: 32 | lm loss: 6.545074E+00 | loss scale: 32768.0 | grad norm: 116907.132 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3458/ 159576 | consumed samples: 63408 | elapsed time per iteration (ms): 14991.7 | learning rate: 1.757E-05 | global batch size: 32 | lm loss: 6.541002E+00 | loss scale: 32768.0 | grad norm: 144421.845 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3459/ 159576 | consumed samples: 63440 | elapsed time per iteration (ms): 14690.5 | learning rate: 1.758E-05 | global batch size: 32 | lm loss: 6.549660E+00 | loss scale: 32768.0 | grad norm: 177618.434 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3460/ 159576 | consumed samples: 63472 | elapsed time per iteration (ms): 14572.5 | learning rate: 1.759E-05 | global batch size: 32 | lm loss: 6.509130E+00 | loss scale: 32768.0 | grad norm: 102216.190 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3461/ 159576 | consumed samples: 63504 | elapsed time per iteration (ms): 14630.9 | learning rate: 1.760E-05 | global batch size: 32 | lm loss: 6.474805E+00 | loss scale: 32768.0 | grad norm: 198903.879 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3462/ 159576 | consumed samples: 63536 | elapsed time per iteration (ms): 14903.4 | learning rate: 1.761E-05 | global batch size: 32 | lm loss: 6.343786E+00 | loss scale: 32768.0 | grad norm: 142714.038 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3463/ 159576 | consumed samples: 63568 | elapsed time per iteration (ms): 14638.9 | learning rate: 1.761E-05 | global batch size: 32 | lm loss: 6.644784E+00 | loss scale: 32768.0 | grad norm: 158591.280 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3464/ 159576 | consumed samples: 63600 | elapsed time per iteration (ms): 14613.0 | learning rate: 1.762E-05 | global batch size: 32 | lm loss: 6.625895E+00 | loss scale: 32768.0 | grad norm: 123320.343 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3465/ 159576 | consumed samples: 63632 | elapsed time per iteration (ms): 14585.1 | learning rate: 1.763E-05 | global batch size: 32 | lm loss: 6.575481E+00 | loss scale: 32768.0 | grad norm: 175492.554 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3466/ 159576 | consumed samples: 63664 | elapsed time per iteration (ms): 15007.9 | learning rate: 1.764E-05 | global batch size: 32 | lm loss: 6.510527E+00 | loss scale: 32768.0 | grad norm: 141462.343 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3467/ 159576 | consumed samples: 63696 | elapsed time per iteration (ms): 14658.4 | learning rate: 1.765E-05 | global batch size: 32 | lm loss: 6.281921E+00 | loss scale: 32768.0 | grad norm: 133404.006 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3468/ 159576 | consumed samples: 63728 | elapsed time per iteration (ms): 14580.1 | 
learning rate: 1.766E-05 | global batch size: 32 | lm loss: 6.438425E+00 | loss scale: 32768.0 | grad norm: 155340.501 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3469/ 159576 | consumed samples: 63760 | elapsed time per iteration (ms): 14575.6 | learning rate: 1.767E-05 | global batch size: 32 | lm loss: 6.527649E+00 | loss scale: 32768.0 | grad norm: 99587.133 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3470/ 159576 | consumed samples: 63792 | elapsed time per iteration (ms): 14895.6 | learning rate: 1.768E-05 | global batch size: 32 | lm loss: 6.196751E+00 | loss scale: 32768.0 | grad norm: 208702.232 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3471/ 159576 | consumed samples: 63824 | elapsed time per iteration (ms): 14601.7 | learning rate: 1.768E-05 | global batch size: 32 | lm loss: 6.487125E+00 | loss scale: 32768.0 | grad norm: 168900.933 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3472/ 159576 | consumed samples: 63856 | elapsed time per iteration (ms): 14566.0 | learning rate: 1.769E-05 | global batch size: 32 | lm loss: 6.509688E+00 | loss scale: 32768.0 | grad norm: 154921.949 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3473/ 159576 | consumed samples: 63888 | elapsed time per iteration (ms): 14575.1 | learning rate: 1.770E-05 | global batch size: 32 | lm loss: 6.622843E+00 | loss scale: 32768.0 | grad norm: 140472.596 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3474/ 159576 | consumed samples: 63920 | elapsed time per iteration (ms): 14877.5 | learning rate: 1.771E-05 | global batch size: 32 | lm loss: 6.475362E+00 | loss scale: 32768.0 | grad norm: 119718.275 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3475/ 159576 | consumed samples: 63952 | elapsed time per iteration (ms): 14552.0 | learning rate: 1.772E-05 | global batch size: 32 | lm loss: 6.465285E+00 | loss scale: 32768.0 | grad norm: 172671.121 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3476/ 159576 | consumed samples: 63984 | elapsed time per iteration (ms): 14582.7 | learning rate: 1.773E-05 | global batch size: 32 | lm loss: 6.389154E+00 | loss scale: 32768.0 | grad norm: 113417.369 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3477/ 159576 | consumed samples: 64016 | elapsed time per iteration (ms): 14606.6 | learning rate: 1.774E-05 | global batch size: 32 | lm loss: 6.582153E+00 | loss scale: 32768.0 | grad norm: 139244.123 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3478/ 159576 | consumed samples: 64048 | elapsed time per iteration (ms): 14915.2 | learning rate: 1.775E-05 | global batch size: 32 | lm loss: 6.490180E+00 | loss scale: 32768.0 | grad norm: 94281.862 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3479/ 159576 | consumed samples: 64080 | elapsed time per iteration (ms): 14555.1 | learning rate: 1.776E-05 | global batch size: 32 | lm loss: 6.683810E+00 | loss scale: 32768.0 | grad norm: 149137.080 | num zeros: 0.0 | number of skipped 
iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3480/ 159576 | consumed samples: 64112 | elapsed time per iteration (ms): 14553.1 | learning rate: 1.776E-05 | global batch size: 32 | lm loss: 6.534214E+00 | loss scale: 32768.0 | grad norm: 129169.136 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3481/ 159576 | consumed samples: 64144 | elapsed time per iteration (ms): 14603.3 | learning rate: 1.777E-05 | global batch size: 32 | lm loss: 6.581446E+00 | loss scale: 32768.0 | grad norm: 115991.644 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3482/ 159576 | consumed samples: 64176 | elapsed time per iteration (ms): 14916.9 | learning rate: 1.778E-05 | global batch size: 32 | lm loss: 6.567008E+00 | loss scale: 32768.0 | grad norm: 184960.532 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3483/ 159576 | consumed samples: 64208 | elapsed time per iteration (ms): 14481.2 | learning rate: 1.779E-05 | global batch size: 32 | lm loss: 6.662760E+00 | loss scale: 32768.0 | grad norm: 134077.108 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3484/ 159576 | consumed samples: 64240 | elapsed time per iteration (ms): 14567.5 | learning rate: 1.780E-05 | global batch size: 32 | lm loss: 6.589795E+00 | loss scale: 32768.0 | grad norm: 126611.070 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3485/ 159576 | consumed samples: 64272 | elapsed time per iteration (ms): 14495.3 | learning rate: 1.781E-05 | global batch size: 32 | lm loss: 6.497936E+00 | loss scale: 32768.0 | grad norm: 122115.644 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3486/ 159576 | consumed samples: 64304 | elapsed time per iteration (ms): 14568.8 | learning rate: 1.782E-05 | global batch size: 32 | lm loss: 6.558665E+00 | loss scale: 32768.0 | grad norm: 126373.837 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3487/ 159576 | consumed samples: 64336 | elapsed time per iteration (ms): 14913.4 | learning rate: 1.783E-05 | global batch size: 32 | lm loss: 6.431637E+00 | loss scale: 32768.0 | grad norm: 161636.464 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3488/ 159576 | consumed samples: 64368 | elapsed time per iteration (ms): 14528.7 | learning rate: 1.784E-05 | global batch size: 32 | lm loss: 6.356628E+00 | loss scale: 32768.0 | grad norm: 114700.134 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3489/ 159576 | consumed samples: 64400 | elapsed time per iteration (ms): 14522.5 | learning rate: 1.784E-05 | global batch size: 32 | lm loss: 6.470509E+00 | loss scale: 32768.0 | grad norm: 157358.888 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3490/ 159576 | consumed samples: 64432 | elapsed time per iteration (ms): 14512.2 | learning rate: 1.785E-05 | global batch size: 32 | lm loss: 6.580731E+00 | loss scale: 32768.0 | grad norm: 124839.092 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3491/ 159576 | consumed samples: 64464 | elapsed time per iteration (ms): 14760.8 | 
learning rate: 1.786E-05 | global batch size: 32 | lm loss: 6.545910E+00 | loss scale: 32768.0 | grad norm: 225734.887 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3492/ 159576 | consumed samples: 64496 | elapsed time per iteration (ms): 14465.1 | learning rate: 1.787E-05 | global batch size: 32 | lm loss: 6.462240E+00 | loss scale: 32768.0 | grad norm: 157153.606 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3493/ 159576 | consumed samples: 64528 | elapsed time per iteration (ms): 14555.7 | learning rate: 1.788E-05 | global batch size: 32 | lm loss: 6.526244E+00 | loss scale: 32768.0 | grad norm: 134834.105 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3494/ 159576 | consumed samples: 64560 | elapsed time per iteration (ms): 14523.5 | learning rate: 1.789E-05 | global batch size: 32 | lm loss: 6.464767E+00 | loss scale: 32768.0 | grad norm: 111080.299 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3495/ 159576 | consumed samples: 64592 | elapsed time per iteration (ms): 14680.5 | learning rate: 1.790E-05 | global batch size: 32 | lm loss: 6.498696E+00 | loss scale: 32768.0 | grad norm: 149926.493 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3496/ 159576 | consumed samples: 64624 | elapsed time per iteration (ms): 14537.6 | learning rate: 1.791E-05 | global batch size: 32 | lm loss: 6.801207E+00 | loss scale: 32768.0 | grad norm: 169978.323 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3497/ 159576 | consumed samples: 64656 | elapsed time per iteration (ms): 14576.8 | learning rate: 1.792E-05 | global batch size: 32 | lm loss: 6.458578E+00 | loss scale: 32768.0 | grad norm: 128624.834 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3498/ 159576 | consumed samples: 64688 | elapsed time per iteration (ms): 14451.0 | learning rate: 1.792E-05 | global batch size: 32 | lm loss: 6.562904E+00 | loss scale: 32768.0 | grad norm: 201818.910 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3499/ 159576 | consumed samples: 64720 | elapsed time per iteration (ms): 14843.4 | learning rate: 1.793E-05 | global batch size: 32 | lm loss: 6.620703E+00 | loss scale: 32768.0 | grad norm: 136369.889 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3500/ 159576 | consumed samples: 64752 | elapsed time per iteration (ms): 14591.5 | learning rate: 1.794E-05 | global batch size: 32 | lm loss: 6.545550E+00 | loss scale: 32768.0 | grad norm: 169642.276 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3501/ 159576 | consumed samples: 64784 | elapsed time per iteration (ms): 14557.9 | learning rate: 1.795E-05 | global batch size: 32 | lm loss: 6.401666E+00 | loss scale: 32768.0 | grad norm: 152333.231 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3502/ 159576 | consumed samples: 64816 | elapsed time per iteration (ms): 14554.3 | learning rate: 1.796E-05 | global batch size: 32 | lm loss: 6.776519E+00 | loss scale: 32768.0 | grad norm: 234394.263 | num zeros: 0.0 | number of skipped 
iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3503/ 159576 | consumed samples: 64848 | elapsed time per iteration (ms): 14868.0 | learning rate: 1.797E-05 | global batch size: 32 | lm loss: 6.465873E+00 | loss scale: 32768.0 | grad norm: 117665.279 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3504/ 159576 | consumed samples: 64880 | elapsed time per iteration (ms): 14552.4 | learning rate: 1.798E-05 | global batch size: 32 | lm loss: 6.534934E+00 | loss scale: 32768.0 | grad norm: 205418.453 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3505/ 159576 | consumed samples: 64912 | elapsed time per iteration (ms): 14532.4 | learning rate: 1.799E-05 | global batch size: 32 | lm loss: 6.777419E+00 | loss scale: 32768.0 | grad norm: 156642.326 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3506/ 159576 | consumed samples: 64944 | elapsed time per iteration (ms): 14549.9 | learning rate: 1.800E-05 | global batch size: 32 | lm loss: 6.528007E+00 | loss scale: 32768.0 | grad norm: 168324.988 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3507/ 159576 | consumed samples: 64976 | elapsed time per iteration (ms): 14947.6 | learning rate: 1.800E-05 | global batch size: 32 | lm loss: 6.669527E+00 | loss scale: 32768.0 | grad norm: 116164.306 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3508/ 159576 | consumed samples: 65008 | elapsed time per iteration (ms): 14485.1 | learning rate: 1.801E-05 | global batch size: 32 | lm loss: 6.649974E+00 | loss scale: 32768.0 | grad norm: 195968.521 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3509/ 159576 | consumed samples: 65040 | elapsed time per iteration (ms): 14549.4 | learning rate: 1.802E-05 | global batch size: 32 | lm loss: 6.636446E+00 | loss scale: 32768.0 | grad norm: 135969.732 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3510/ 159576 | consumed samples: 65072 | elapsed time per iteration (ms): 14546.9 | learning rate: 1.803E-05 | global batch size: 32 | lm loss: 6.529005E+00 | loss scale: 32768.0 | grad norm: 225903.317 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3511/ 159576 | consumed samples: 65104 | elapsed time per iteration (ms): 14847.8 | learning rate: 1.804E-05 | global batch size: 32 | lm loss: 6.629415E+00 | loss scale: 32768.0 | grad norm: 130652.559 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3512/ 159576 | consumed samples: 65136 | elapsed time per iteration (ms): 14520.0 | learning rate: 1.805E-05 | global batch size: 32 | lm loss: 6.599288E+00 | loss scale: 32768.0 | grad norm: 149863.059 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3513/ 159576 | consumed samples: 65168 | elapsed time per iteration (ms): 14651.1 | learning rate: 1.806E-05 | global batch size: 32 | lm loss: 6.592654E+00 | loss scale: 32768.0 | grad norm: 166996.968 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3514/ 159576 | consumed samples: 65200 | elapsed time per iteration (ms): 14479.3 | 
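Each stanza above follows the fixed Megatron-style "key: value |" layout: iteration counter out of 159576 total, cumulative consumed samples (advancing by the global batch size of 32), wall time per iteration, the linearly warming learning rate, lm loss, fp16 loss scale, and gradient norm. A minimal parsing sketch, assuming only that layout; ITER_RE and parse_line are illustrative names, not part of the training code:

    import re

    ITER_RE = re.compile(
        r"iteration\s+(?P<step>\d+)/\s*(?P<total>\d+)\s*\|"
        r".*?consumed samples:\s*(?P<samples>\d+)\s*\|"
        r".*?elapsed time per iteration \(ms\):\s*(?P<ms>[\d.]+)\s*\|"
        r".*?learning rate:\s*(?P<lr>[\d.eE+-]+)\s*\|"
        r".*?lm loss:\s*(?P<loss>[\d.eE+-]+)\s*\|"
        r".*?grad norm:\s*(?P<gnorm>[\d.]+)"
    )

    def parse_line(line):
        # Returns a dict of stats for one iteration line, or None for
        # non-stanza lines such as "time (ms)" or the PULSE messages.
        m = ITER_RE.search(line.lstrip("- "))
        if m is None:
            return None
        return {
            "step": int(m["step"]),
            "total": int(m["total"]),
            "samples": int(m["samples"]),
            "iter_ms": float(m["ms"]),
            "lr": float(m["lr"]),
            "loss": float(m["loss"]),
            "grad_norm": float(m["gnorm"]),
        }

Fed the iteration 3377 line above it yields step 3377, samples 60816, loss 6.365215, which is enough to plot the loss curve or check the warmup schedule across the whole log.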
- iteration 3514/ 159576 | consumed samples: 65200 | elapsed time per iteration (ms): 14479.3 | learning rate: 1.807E-05 | global batch size: 32 | lm loss: 6.540200E+00 | loss scale: 32768.0 | grad norm: 115498.690 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3515/ 159576 | consumed samples: 65232 | elapsed time per iteration (ms): 14930.0 | learning rate: 1.808E-05 | global batch size: 32 | lm loss: 6.488201E+00 | loss scale: 32768.0 | grad norm: 217689.196 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3516/ 159576 | consumed samples: 65264 | elapsed time per iteration (ms): 14459.8 | learning rate: 1.808E-05 | global batch size: 32 | lm loss: 6.478746E+00 | loss scale: 32768.0 | grad norm: 131460.444 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3517/ 159576 | consumed samples: 65296 | elapsed time per iteration (ms): 14524.9 | learning rate: 1.809E-05 | global batch size: 32 | lm loss: 6.658568E+00 | loss scale: 32768.0 | grad norm: 186540.119 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3518/ 159576 | consumed samples: 65328 | elapsed time per iteration (ms): 14525.2 | learning rate: 1.810E-05 | global batch size: 32 | lm loss: 6.641760E+00 | loss scale: 32768.0 | grad norm: 215453.929 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3519/ 159576 | consumed samples: 65360 | elapsed time per iteration (ms): 14903.9 | learning rate: 1.811E-05 | global batch size: 32 | lm loss: 6.578794E+00 | loss scale: 32768.0 | grad norm: 129785.760 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3520/ 159576 | consumed samples: 65392 | elapsed time per iteration (ms): 14710.5 | learning rate: 1.812E-05 | global batch size: 32 | lm loss: 6.623507E+00 | loss scale: 32768.0 | grad norm: 120935.963 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3521/ 159576 | consumed samples: 65424 | elapsed time per iteration (ms): 14520.7 | learning rate: 1.813E-05 | global batch size: 32 | lm loss: 6.597843E+00 | loss scale: 32768.0 | grad norm: 116244.009 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3522/ 159576 | consumed samples: 65456 | elapsed time per iteration (ms): 14597.0 | learning rate: 1.814E-05 | global batch size: 32 | lm loss: 6.504926E+00 | loss scale: 32768.0 | grad norm: 134767.376 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3523/ 159576 | consumed samples: 65488 | elapsed time per iteration (ms): 14942.9 | learning rate: 1.815E-05 | global batch size: 32 | lm loss: 6.435289E+00 | loss scale: 32768.0 | grad norm: 86682.164 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3524/ 159576 | consumed samples: 65520 | elapsed time per iteration (ms): 14654.2 | learning rate: 1.816E-05 | global batch size: 32 | lm loss: 6.594196E+00 | loss scale: 32768.0 | grad norm: 134027.315 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3525/ 159576 | consumed samples: 65552 | elapsed time per iteration (ms): 14562.7 | learning rate: 1.816E-05 | global batch size: 32 | lm loss: 6.679243E+00 | loss scale: 32768.0 | grad norm: 125221.442 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3526/ 159576 | consumed samples: 65584 | elapsed time per iteration (ms): 14630.7 | learning rate: 1.817E-05 | global batch size: 32 | lm loss: 6.456674E+00 | loss scale: 32768.0 | grad norm: 86112.712 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3527/ 159576 | consumed samples: 65616 | elapsed time per iteration (ms): 14493.8 | learning rate: 1.818E-05 | global batch size: 32 | lm loss: 6.600234E+00 | loss scale: 32768.0 | grad norm: 300729.659 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3528/ 159576 | consumed samples: 65648 | elapsed time per iteration (ms): 14813.0 | learning rate: 1.819E-05 | global batch size: 32 | lm loss: 6.399897E+00 | loss scale: 32768.0 | grad norm: 153878.237 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3529/ 159576 | consumed samples: 65680 | elapsed time per iteration (ms): 14593.6 | learning rate: 1.820E-05 | global batch size: 32 | lm loss: 6.540657E+00 | loss scale: 32768.0 | grad norm: 150860.243 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3530/ 159576 | consumed samples: 65712 | elapsed time per iteration (ms): 14559.8 | learning rate: 1.821E-05 | global batch size: 32 | lm loss: 6.503862E+00 | loss scale: 32768.0 | grad norm: 149193.561 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3531/ 159576 | consumed samples: 65744 | elapsed time per iteration (ms): 14581.4 | learning rate: 1.822E-05 | global batch size: 32 | lm loss: 6.692787E+00 | loss scale: 32768.0 | grad norm: 207812.798 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3532/ 159576 | consumed samples: 65776 | elapsed time per iteration (ms): 14715.5 | learning rate: 1.823E-05 | global batch size: 32 | lm loss: 6.484317E+00 | loss scale: 32768.0 | grad norm: 161092.514 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3533/ 159576 | consumed samples: 65808 | elapsed time per iteration (ms): 14610.9 | learning rate: 1.824E-05 | global batch size: 32 | lm loss: 6.475138E+00 | loss scale: 32768.0 | grad norm: 155421.456 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3534/ 159576 | consumed samples: 65840 | elapsed time per iteration (ms): 14445.3 | learning rate: 1.824E-05 | global batch size: 32 | lm loss: 6.511703E+00 | loss scale: 32768.0 | grad norm: 114681.720 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3535/ 159576 | consumed samples: 65872 | elapsed time per iteration (ms): 14477.9 | learning rate: 1.825E-05 | global batch size: 32 | lm loss: 6.509159E+00 | loss scale: 32768.0 | grad norm: 183050.824 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3536/ 159576 | consumed samples: 65904 | elapsed time per iteration (ms): 14816.2 | learning rate: 1.826E-05 | global batch size: 32 | lm loss: 6.497670E+00 | loss scale: 32768.0 | grad norm: 96091.191 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3537/ 159576 | consumed samples: 65936 | elapsed time per iteration (ms): 14439.5 | learning rate: 1.827E-05 | global batch size: 32 | lm loss: 6.505747E+00 | loss scale: 32768.0 | grad norm: 140156.886 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3538/ 159576 | consumed samples: 65968 | elapsed time per iteration (ms): 14594.1 | learning rate: 1.828E-05 | global batch size: 32 | lm loss: 6.516546E+00 | loss scale: 32768.0 | grad norm: 97276.324 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3539/ 159576 | consumed samples: 66000 | elapsed time per iteration (ms): 14531.0 | learning rate: 1.829E-05 | global batch size: 32 | lm loss: 6.589782E+00 | loss scale: 32768.0 | grad norm: 283362.362 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3540/ 159576 | consumed samples: 66032 | elapsed time per iteration (ms): 14766.1 | learning rate: 1.830E-05 | global batch size: 32 | lm loss: 6.457118E+00 | loss scale: 32768.0 | grad norm: 119093.566 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3541/ 159576 | consumed samples: 66064 | elapsed time per iteration (ms): 14538.8 | learning rate: 1.831E-05 | global batch size: 32 | lm loss: 6.543458E+00 | loss scale: 32768.0 | grad norm: 143270.575 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3542/ 159576 | consumed samples: 66096 | elapsed time per iteration (ms): 14503.8 | learning rate: 1.832E-05 | global batch size: 32 | lm loss: 6.549830E+00 | loss scale: 32768.0 | grad norm: 146934.297 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3543/ 159576 | consumed samples: 66128 | elapsed time per iteration (ms): 14525.1 | learning rate: 1.832E-05 | global batch size: 32 | lm loss: 6.523373E+00 | loss scale: 32768.0 | grad norm: 246079.782 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3544/ 159576 | consumed samples: 66160 | elapsed time per iteration (ms): 14836.5 | learning rate: 1.833E-05 | global batch size: 32 | lm loss: 6.484323E+00 | loss scale: 32768.0 | grad norm: 150473.482 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3545/ 159576 | consumed samples: 66192 | elapsed time per iteration (ms): 14612.1 | learning rate: 1.834E-05 | global batch size: 32 | lm loss: 6.596731E+00 | loss scale: 32768.0 | grad norm: 157995.993 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3546/ 159576 | consumed samples: 66224 | elapsed time per iteration (ms): 14518.2 | learning rate: 1.835E-05 | global batch size: 32 | lm loss: 6.564546E+00 | loss scale: 32768.0 | grad norm: 164874.723 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3547/ 159576 | consumed samples: 66256 | elapsed time per iteration (ms): 14501.0 | learning rate: 1.836E-05 | global batch size: 32 | lm loss: 6.427078E+00 | loss scale: 32768.0 | grad norm: 175876.651 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3548/ 159576 | consumed samples: 66288 | elapsed time per iteration (ms): 14899.9 | learning rate: 1.837E-05 | global batch size: 32 | lm loss: 6.488606E+00 | loss scale: 32768.0 | grad norm: 198886.829 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3549/ 159576 | consumed samples: 66320 | elapsed time per iteration (ms): 14520.6 | learning rate: 1.838E-05 | global batch size: 32 | lm loss: 6.462682E+00 | loss scale: 32768.0 | grad norm: 127675.702 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3550/ 159576 | consumed samples: 66352 | elapsed time per iteration (ms): 14447.8 | learning rate: 1.839E-05 | global batch size: 32 | lm loss: 6.652044E+00 | loss scale: 32768.0 | grad norm: 140944.667 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3551/ 159576 | consumed samples: 66384 | elapsed time per iteration (ms): 14467.2 | learning rate: 1.839E-05 | global batch size: 32 | lm loss: 6.520955E+00 | loss scale: 32768.0 | grad norm: 86094.102 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3552/ 159576 | consumed samples: 66416 | elapsed time per iteration (ms): 14808.2 | learning rate: 1.840E-05 | global batch size: 32 | lm loss: 6.429432E+00 | loss scale: 32768.0 | grad norm: 116647.112 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3553/ 159576 | consumed samples: 66448 | elapsed time per iteration (ms): 14503.5 | learning rate: 1.841E-05 | global batch size: 32 | lm loss: 6.463936E+00 | loss scale: 32768.0 | grad norm: 118564.730 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3554/ 159576 | consumed samples: 66480 | elapsed time per iteration (ms): 14502.1 | learning rate: 1.842E-05 | global batch size: 32 | lm loss: 6.458220E+00 | loss scale: 32768.0 | grad norm: 112013.908 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3555/ 159576 | consumed samples: 66512 | elapsed time per iteration (ms): 14486.2 | learning rate: 1.843E-05 | global batch size: 32 | lm loss: 6.492205E+00 | loss scale: 32768.0 | grad norm: 95075.794 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3556/ 159576 | consumed samples: 66544 | elapsed time per iteration (ms): 14873.1 | learning rate: 1.844E-05 | global batch size: 32 | lm loss: 6.582590E+00 | loss scale: 32768.0 | grad norm: 160024.973 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3557/ 159576 | consumed samples: 66576 | elapsed time per iteration (ms): 14487.7 | learning rate: 1.845E-05 | global batch size: 32 | lm loss: 6.504139E+00 | loss scale: 32768.0 | grad norm: 102536.359 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3558/ 159576 | consumed samples: 66608 | elapsed time per iteration (ms): 14571.2 | learning rate: 1.846E-05 | global batch size: 32 | lm loss: 6.514203E+00 | loss scale: 32768.0 | grad norm: 221229.679 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3559/ 159576 | consumed samples: 66640 | elapsed time per iteration (ms): 14451.0 | learning rate: 1.847E-05 | global batch size: 32 | lm loss: 6.560319E+00 | loss scale: 32768.0 | grad norm: 131012.754 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3560/ 159576 | consumed samples: 66672 | elapsed time per iteration (ms): 14938.1 | learning rate: 1.847E-05 | global batch size: 32 | lm loss: 6.372297E+00 | loss scale: 32768.0 | grad norm: 139056.836 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3561/ 159576 | consumed samples: 66704 | elapsed time per iteration (ms): 14523.1 | learning rate: 1.848E-05 | global batch size: 32 | lm loss: 6.416655E+00 | loss scale: 32768.0 | grad norm: 147497.179 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3562/ 159576 | consumed samples: 66736 | elapsed time per iteration (ms): 14487.9 | learning rate: 1.849E-05 | global batch size: 32 | lm loss: 6.474949E+00 | loss scale: 32768.0 | grad norm: 174437.813 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3563/ 159576 | consumed samples: 66768 | elapsed time per iteration (ms): 14468.9 | learning rate: 1.850E-05 | global batch size: 32 | lm loss: 6.623423E+00 | loss scale: 32768.0 | grad norm: 122791.597 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3564/ 159576 | consumed samples: 66800 | elapsed time per iteration (ms): 14508.1 | learning rate: 1.851E-05 | global batch size: 32 | lm loss: 6.516719E+00 | loss scale: 32768.0 | grad norm: 125896.178 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3565/ 159576 | consumed samples: 66832 | elapsed time per iteration (ms): 14821.3 | learning rate: 1.852E-05 | global batch size: 32 | lm loss: 6.567136E+00 | loss scale: 32768.0 | grad norm: 156146.827 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3566/ 159576 | consumed samples: 66864 | elapsed time per iteration (ms): 14550.7 | learning rate: 1.853E-05 | global batch size: 32 | lm loss: 6.464426E+00 | loss scale: 32768.0 | grad norm: 112089.852 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3567/ 159576 | consumed samples: 66896 | elapsed time per iteration (ms): 14483.3 | learning rate: 1.854E-05 | global batch size: 32 | lm loss: 6.330031E+00 | loss scale: 32768.0 | grad norm: 100672.150 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3568/ 159576 | consumed samples: 66928 | elapsed time per iteration (ms): 14573.3 | learning rate: 1.855E-05 | global batch size: 32 | lm loss: 6.472744E+00 | loss scale: 32768.0 | grad norm: 206164.387 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3569/ 159576 | consumed samples: 66960 | elapsed time per iteration (ms): 14778.2 | learning rate: 1.855E-05 | global batch size: 32 | lm loss: 6.502261E+00 | loss scale: 32768.0 | grad norm: 117741.940 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3570/ 159576 | consumed samples: 66992 | elapsed time per iteration (ms): 14563.8 | learning rate: 1.856E-05 | global batch size: 32 | lm loss: 6.480472E+00 | loss scale: 32768.0 | grad norm: 180667.970 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
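Throughout this window the loss scale sits at 32768 (2**15) while both "number of skipped iterations" and "number of nan iterations" stay at 0, meaning no fp16 overflow has forced the scaler to back off. A sketch of the standard dynamic loss-scaling rule that these columns report on; the constants here are common defaults, assumed rather than read from this run's configuration:

    class DynamicLossScaler:
        def __init__(self, init_scale=2.0 ** 15, growth_interval=1000):
            self.scale = init_scale          # 32768.0, as logged above
            self.growth_interval = growth_interval
            self.good_steps = 0

        def update(self, found_overflow):
            # Returns True when the optimizer step must be skipped; a
            # skipped step is what "number of skipped iterations" counts.
            if found_overflow:
                self.scale /= 2.0            # back off and retry smaller
                self.good_steps = 0
                return True
            self.good_steps += 1
            if self.good_steps % self.growth_interval == 0:
                self.scale *= 2.0            # probe a larger scale
            return False

The trainer's actual scaler may differ in detail; the point is only why a flat 32768 together with zero skipped and nan iterations indicates a stable fp16 run in this stretch.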
iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3572/ 159576 | consumed samples: 67056 | elapsed time per iteration (ms): 14532.0 | learning rate: 1.858E-05 | global batch size: 32 | lm loss: 6.478413E+00 | loss scale: 32768.0 | grad norm: 135823.282 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3573/ 159576 | consumed samples: 67088 | elapsed time per iteration (ms): 14807.4 | learning rate: 1.859E-05 | global batch size: 32 | lm loss: 6.589501E+00 | loss scale: 32768.0 | grad norm: 147763.903 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3574/ 159576 | consumed samples: 67120 | elapsed time per iteration (ms): 14483.4 | learning rate: 1.860E-05 | global batch size: 32 | lm loss: 6.503617E+00 | loss scale: 32768.0 | grad norm: 85865.567 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3575/ 159576 | consumed samples: 67152 | elapsed time per iteration (ms): 14505.6 | learning rate: 1.861E-05 | global batch size: 32 | lm loss: 6.573061E+00 | loss scale: 32768.0 | grad norm: 180050.879 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3576/ 159576 | consumed samples: 67184 | elapsed time per iteration (ms): 14550.9 | learning rate: 1.862E-05 | global batch size: 32 | lm loss: 6.480776E+00 | loss scale: 32768.0 | grad norm: 122066.327 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3577/ 159576 | consumed samples: 67216 | elapsed time per iteration (ms): 14868.6 | learning rate: 1.863E-05 | global batch size: 32 | lm loss: 6.625753E+00 | loss scale: 32768.0 | grad norm: 166062.326 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3578/ 159576 | consumed samples: 67248 | elapsed time per iteration (ms): 14594.8 | learning rate: 1.863E-05 | global batch size: 32 | lm loss: 6.470201E+00 | loss scale: 32768.0 | grad norm: 158898.525 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -[2021-09-24 16:06:53] PULSE: tr8-104B is waiting for the previous job to finish before scheduling a new one using the dependency mechanism (1165978_[1-10%1] on 'gpu_p13' partition) -[2021-09-24 16:06:53] PULSE: tr8-104B is running for 10:14:42 since 2021-09-24T05:52:11 (1162855_1 on 'gpu_p13' partition (r6i4n[5,7],r6i5n[2,7-8],r6i6n[0,2,6],r7i2n[4-5],r7i6n[2-4],r7i7n[7-8],r8i0n[2-3,5-8],r8i1n[0,2-4],r8i2n8,r8i3n[0-2],r8i5n[3-4],r8i7n[3-8],r9i0n[0-2],r9i1n[0-3],r9i2n[3-5,8],r9i3n[0-1,7-8],r9i4n[0-2],r9i5n[3-8],r9i6n[0,7-8]) - iteration 3579/ 159576 | consumed samples: 67280 | elapsed time per iteration (ms): 14505.5 | learning rate: 1.864E-05 | global batch size: 32 | lm loss: 6.669123E+00 | loss scale: 32768.0 | grad norm: 114371.472 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3580/ 159576 | consumed samples: 67312 | elapsed time per iteration (ms): 14435.4 | learning rate: 1.865E-05 | global batch size: 32 | lm loss: 6.504656E+00 | loss scale: 32768.0 | grad norm: 143322.183 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3581/ 159576 | consumed samples: 67344 | elapsed time per iteration (ms): 14983.8 | learning rate: 1.866E-05 | global batch size: 32 | lm loss: 6.634960E+00 | loss scale: 
- iteration 3582/ 159576 | consumed samples: 67376 | elapsed time per iteration (ms): 14518.7 | learning rate: 1.867E-05 | global batch size: 32 | lm loss: 6.488723E+00 | loss scale: 32768.0 | grad norm: 108661.260 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3583/ 159576 | consumed samples: 67408 | elapsed time per iteration (ms): 14495.4 | learning rate: 1.868E-05 | global batch size: 32 | lm loss: 6.397575E+00 | loss scale: 32768.0 | grad norm: 156428.484 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3584/ 159576 | consumed samples: 67440 | elapsed time per iteration (ms): 14500.4 | learning rate: 1.869E-05 | global batch size: 32 | lm loss: 6.505555E+00 | loss scale: 32768.0 | grad norm: 158735.801 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3585/ 159576 | consumed samples: 67472 | elapsed time per iteration (ms): 14850.8 | learning rate: 1.870E-05 | global batch size: 32 | lm loss: 6.384704E+00 | loss scale: 32768.0 | grad norm: 121455.406 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3586/ 159576 | consumed samples: 67504 | elapsed time per iteration (ms): 14516.1 | learning rate: 1.871E-05 | global batch size: 32 | lm loss: 6.391223E+00 | loss scale: 32768.0 | grad norm: 200272.961 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3587/ 159576 | consumed samples: 67536 | elapsed time per iteration (ms): 14478.9 | learning rate: 1.871E-05 | global batch size: 32 | lm loss: 6.602296E+00 | loss scale: 32768.0 | grad norm: 156857.138 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3588/ 159576 | consumed samples: 67568 | elapsed time per iteration (ms): 14457.3 | learning rate: 1.872E-05 | global batch size: 32 | lm loss: 6.356599E+00 | loss scale: 32768.0 | grad norm: 132240.106 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3589/ 159576 | consumed samples: 67600 | elapsed time per iteration (ms): 14840.9 | learning rate: 1.873E-05 | global batch size: 32 | lm loss: 6.517581E+00 | loss scale: 32768.0 | grad norm: 101976.390 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3590/ 159576 | consumed samples: 67632 | elapsed time per iteration (ms): 14478.5 | learning rate: 1.874E-05 | global batch size: 32 | lm loss: 6.495076E+00 | loss scale: 32768.0 | grad norm: 145637.558 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3591/ 159576 | consumed samples: 67664 | elapsed time per iteration (ms): 14537.3 | learning rate: 1.875E-05 | global batch size: 32 | lm loss: 6.486649E+00 | loss scale: 32768.0 | grad norm: 110128.136 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3592/ 159576 | consumed samples: 67696 | elapsed time per iteration (ms): 14585.1 | learning rate: 1.876E-05 | global batch size: 32 | lm loss: 6.484485E+00 | loss scale: 32768.0 | grad norm: 93123.364 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3593/ 159576 | consumed samples: 67728 | elapsed time per iteration (ms): 14970.8 | learning rate: 1.877E-05 | global batch size: 32 | lm loss: 6.605970E+00 | loss scale: 32768.0 | grad norm: 196733.888 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3594/ 159576 | consumed samples: 67760 | elapsed time per iteration (ms): 14488.2 | learning rate: 1.878E-05 | global batch size: 32 | lm loss: 6.408032E+00 | loss scale: 32768.0 | grad norm: 119062.835 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3595/ 159576 | consumed samples: 67792 | elapsed time per iteration (ms): 14589.0 | learning rate: 1.879E-05 | global batch size: 32 | lm loss: 6.434669E+00 | loss scale: 32768.0 | grad norm: 163713.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3596/ 159576 | consumed samples: 67824 | elapsed time per iteration (ms): 14467.1 | learning rate: 1.879E-05 | global batch size: 32 | lm loss: 6.515763E+00 | loss scale: 32768.0 | grad norm: 123609.059 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3597/ 159576 | consumed samples: 67856 | elapsed time per iteration (ms): 14918.0 | learning rate: 1.880E-05 | global batch size: 32 | lm loss: 6.473671E+00 | loss scale: 32768.0 | grad norm: 113241.499 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3598/ 159576 | consumed samples: 67888 | elapsed time per iteration (ms): 14630.3 | learning rate: 1.881E-05 | global batch size: 32 | lm loss: 6.497471E+00 | loss scale: 32768.0 | grad norm: 180550.199 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3599/ 159576 | consumed samples: 67920 | elapsed time per iteration (ms): 14523.9 | learning rate: 1.882E-05 | global batch size: 32 | lm loss: 6.665214E+00 | loss scale: 32768.0 | grad norm: 120833.867 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3600/ 159576 | consumed samples: 67952 | elapsed time per iteration (ms): 14548.6 | learning rate: 1.883E-05 | global batch size: 32 | lm loss: 6.506467E+00 | loss scale: 32768.0 | grad norm: 124134.552 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3601/ 159576 | consumed samples: 67984 | elapsed time per iteration (ms): 14576.2 | learning rate: 1.884E-05 | global batch size: 32 | lm loss: 6.491764E+00 | loss scale: 32768.0 | grad norm: 230059.443 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3602/ 159576 | consumed samples: 68016 | elapsed time per iteration (ms): 14979.8 | learning rate: 1.885E-05 | global batch size: 32 | lm loss: 6.445697E+00 | loss scale: 32768.0 | grad norm: 125622.628 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3603/ 159576 | consumed samples: 68048 | elapsed time per iteration (ms): 14453.6 | learning rate: 1.886E-05 | global batch size: 32 | lm loss: 6.613330E+00 | loss scale: 32768.0 | grad norm: 166344.814 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3604/ 159576 | consumed samples: 68080 | elapsed time per iteration (ms): 14495.4 | learning rate: 1.887E-05 | global batch size: 32 | lm loss: 6.603212E+00 | loss scale: 32768.0 | grad norm: 93757.784 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3605/ 159576 | consumed samples: 68112 | elapsed time per iteration (ms): 14542.0 | learning rate: 1.887E-05 | global batch size: 32 | lm loss: 6.342390E+00 | loss scale: 32768.0 | grad norm: 130006.029 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3606/ 159576 | consumed samples: 68144 | elapsed time per iteration (ms): 14685.4 | learning rate: 1.888E-05 | global batch size: 32 | lm loss: 6.480408E+00 | loss scale: 32768.0 | grad norm: 106365.528 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3607/ 159576 | consumed samples: 68176 | elapsed time per iteration (ms): 14517.9 | learning rate: 1.889E-05 | global batch size: 32 | lm loss: 6.591272E+00 | loss scale: 32768.0 | grad norm: 171235.897 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3608/ 159576 | consumed samples: 68208 | elapsed time per iteration (ms): 14591.0 | learning rate: 1.890E-05 | global batch size: 32 | lm loss: 6.311239E+00 | loss scale: 32768.0 | grad norm: 126858.601 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3609/ 159576 | consumed samples: 68240 | elapsed time per iteration (ms): 14549.9 | learning rate: 1.891E-05 | global batch size: 32 | lm loss: 6.395494E+00 | loss scale: 32768.0 | grad norm: 227345.632 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3610/ 159576 | consumed samples: 68272 | elapsed time per iteration (ms): 14677.9 | learning rate: 1.892E-05 | global batch size: 32 | lm loss: 6.557859E+00 | loss scale: 32768.0 | grad norm: 116386.145 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3611/ 159576 | consumed samples: 68304 | elapsed time per iteration (ms): 14497.7 | learning rate: 1.893E-05 | global batch size: 32 | lm loss: 6.436782E+00 | loss scale: 32768.0 | grad norm: 130216.388 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3612/ 159576 | consumed samples: 68336 | elapsed time per iteration (ms): 14516.9 | learning rate: 1.894E-05 | global batch size: 32 | lm loss: 6.523721E+00 | loss scale: 32768.0 | grad norm: 153807.816 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3613/ 159576 | consumed samples: 68368 | elapsed time per iteration (ms): 14537.1 | learning rate: 1.895E-05 | global batch size: 32 | lm loss: 6.480092E+00 | loss scale: 32768.0 | grad norm: 191977.060 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3614/ 159576 | consumed samples: 68400 | elapsed time per iteration (ms): 14777.4 | learning rate: 1.895E-05 | global batch size: 32 | lm loss: 6.507137E+00 | loss scale: 32768.0 | grad norm: 147123.785 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3615/ 159576 | consumed samples: 68432 | elapsed time per iteration (ms): 14631.8 | learning rate: 1.896E-05 | global batch size: 32 | lm loss: 6.413469E+00 | loss scale: 32768.0 | grad norm: 151298.616 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3616/ 159576 | consumed samples: 68464 | elapsed time per iteration (ms): 14498.7 | learning rate: 1.897E-05 | global batch size: 32 | lm loss: 6.400654E+00 | loss scale: 32768.0 | grad norm: 144773.834 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3617/ 159576 | consumed samples: 68496 | elapsed time per iteration (ms): 14516.2 | learning rate: 1.898E-05 | global batch size: 32 | lm loss: 6.514056E+00 | loss scale: 32768.0 | grad norm: 212184.973 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3618/ 159576 | consumed samples: 68528 | elapsed time per iteration (ms): 15120.1 | learning rate: 1.899E-05 | global batch size: 32 | lm loss: 6.476982E+00 | loss scale: 32768.0 | grad norm: 138389.337 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3619/ 159576 | consumed samples: 68560 | elapsed time per iteration (ms): 14520.5 | learning rate: 1.900E-05 | global batch size: 32 | lm loss: 6.413394E+00 | loss scale: 32768.0 | grad norm: 144757.897 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3620/ 159576 | consumed samples: 68592 | elapsed time per iteration (ms): 14501.8 | learning rate: 1.901E-05 | global batch size: 32 | lm loss: 6.508588E+00 | loss scale: 32768.0 | grad norm: 119480.778 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3621/ 159576 | consumed samples: 68624 | elapsed time per iteration (ms): 14544.3 | learning rate: 1.902E-05 | global batch size: 32 | lm loss: 6.462088E+00 | loss scale: 32768.0 | grad norm: 118576.762 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3622/ 159576 | consumed samples: 68656 | elapsed time per iteration (ms): 14904.8 | learning rate: 1.903E-05 | global batch size: 32 | lm loss: 6.518481E+00 | loss scale: 32768.0 | grad norm: 166384.993 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3623/ 159576 | consumed samples: 68688 | elapsed time per iteration (ms): 14536.7 | learning rate: 1.903E-05 | global batch size: 32 | lm loss: 6.418991E+00 | loss scale: 32768.0 | grad norm: 133937.631 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3624/ 159576 | consumed samples: 68720 | elapsed time per iteration (ms): 14549.8 | learning rate: 1.904E-05 | global batch size: 32 | lm loss: 6.446878E+00 | loss scale: 32768.0 | grad norm: 270206.058 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3625/ 159576 | consumed samples: 68752 | elapsed time per iteration (ms): 14599.2 | learning rate: 1.905E-05 | global batch size: 32 | lm loss: 6.534576E+00 | loss scale: 32768.0 | grad norm: 155344.465 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3626/ 159576 | consumed samples: 68784 | elapsed time per iteration (ms): 14722.9 | learning rate: 1.906E-05 | global batch size: 32 | lm loss: 6.630429E+00 | loss scale: 32768.0 | grad norm: 199114.246 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3627/ 159576 | consumed samples: 68816 | elapsed time per iteration (ms): 14500.1 | learning rate: 1.907E-05 | global batch size: 32 | lm loss: 6.356173E+00 | loss scale: 32768.0 | grad norm: 167282.135 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3628/ 159576 | consumed samples: 68848 | elapsed time per iteration (ms): 14530.4 | learning rate: 1.908E-05 | global batch size: 32 | lm loss: 6.471046E+00 | loss scale: 32768.0 | grad norm: 208481.248 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3629/ 159576 | consumed samples: 68880 | elapsed time per iteration (ms): 14549.1 | learning rate: 1.909E-05 | global batch size: 32 | lm loss: 6.412348E+00 | loss scale: 32768.0 | grad norm: 149105.571 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3630/ 159576 | consumed samples: 68912 | elapsed time per iteration (ms): 14882.4 | learning rate: 1.910E-05 | global batch size: 32 | lm loss: 6.520298E+00 | loss scale: 32768.0 | grad norm: 123369.844 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3631/ 159576 | consumed samples: 68944 | elapsed time per iteration (ms): 14575.6 | learning rate: 1.911E-05 | global batch size: 32 | lm loss: 6.558264E+00 | loss scale: 32768.0 | grad norm: 243133.943 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3632/ 159576 | consumed samples: 68976 | elapsed time per iteration (ms): 14516.5 | learning rate: 1.911E-05 | global batch size: 32 | lm loss: 6.583918E+00 | loss scale: 32768.0 | grad norm: 178142.765 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3633/ 159576 | consumed samples: 69008 | elapsed time per iteration (ms): 14471.4 | learning rate: 1.912E-05 | global batch size: 32 | lm loss: 6.540310E+00 | loss scale: 32768.0 | grad norm: 189782.276 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3634/ 159576 | consumed samples: 69040 | elapsed time per iteration (ms): 14945.9 | learning rate: 1.913E-05 | global batch size: 32 | lm loss: 6.505736E+00 | loss scale: 32768.0 | grad norm: 165872.968 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3635/ 159576 | consumed samples: 69072 | elapsed time per iteration (ms): 14539.5 | learning rate: 1.914E-05 | global batch size: 32 | lm loss: 6.509236E+00 | loss scale: 32768.0 | grad norm: 245470.953 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3636/ 159576 | consumed samples: 69104 | elapsed time per iteration (ms): 14545.2 | learning rate: 1.915E-05 | global batch size: 32 | lm loss: 6.504992E+00 | loss scale: 32768.0 | grad norm: 150104.290 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3637/ 159576 | consumed samples: 69136 | elapsed time per iteration (ms): 14567.6 | learning rate: 1.916E-05 | global batch size: 32 | lm loss: 6.406890E+00 | loss scale: 32768.0 | grad norm: 135913.146 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3638/ 159576 | consumed samples: 69168 | elapsed time per iteration (ms): 14896.3 | learning rate: 1.917E-05 | global batch size: 32 | lm loss: 6.443694E+00 | loss scale: 32768.0 | grad norm: 185702.085 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3639/ 159576 | consumed samples: 69200 | elapsed time per iteration (ms): 14591.0 | learning rate: 1.918E-05 | global batch size: 32 | lm loss: 6.556330E+00 | loss scale: 32768.0 | grad norm: 244123.289 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3640/ 159576 | consumed samples: 69232 | elapsed time per iteration (ms): 14549.7 | learning rate: 1.918E-05 | global batch size: 32 | lm loss: 6.487778E+00 | loss scale: 32768.0 | grad norm: 177114.568 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3641/ 159576 | consumed samples: 69264 | elapsed time per iteration (ms): 14570.7 | learning rate: 1.919E-05 | global batch size: 32 | lm loss: 6.513255E+00 | loss scale: 32768.0 | grad norm: 131694.234 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3642/ 159576 | consumed samples: 69296 | elapsed time per iteration (ms): 14516.4 | learning rate: 1.920E-05 | global batch size: 32 | lm loss: 6.592026E+00 | loss scale: 32768.0 | grad norm: 290876.521 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3643/ 159576 | consumed samples: 69328 | elapsed time per iteration (ms): 14756.7 | learning rate: 1.921E-05 | global batch size: 32 | lm loss: 6.662066E+00 | loss scale: 32768.0 | grad norm: 228974.687 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3644/ 159576 | consumed samples: 69360 | elapsed time per iteration (ms): 14551.2 | learning rate: 1.922E-05 | global batch size: 32 | lm loss: 6.366663E+00 | loss scale: 32768.0 | grad norm: 161091.231 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3645/ 159576 | consumed samples: 69392 | elapsed time per iteration (ms): 14619.9 | learning rate: 1.923E-05 | global batch size: 32 | lm loss: 6.523453E+00 | loss scale: 32768.0 | grad norm: 136622.848 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3646/ 159576 | consumed samples: 69424 | elapsed time per iteration (ms): 14549.7 | learning rate: 1.924E-05 | global batch size: 32 | lm loss: 6.502388E+00 | loss scale: 32768.0 | grad norm: 233041.164 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3647/ 159576 | consumed samples: 69456 | elapsed time per iteration (ms): 14639.6 | learning rate: 1.925E-05 | global batch size: 32 | lm loss: 6.570889E+00 | loss scale: 32768.0 | grad norm: 177700.635 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3648/ 159576 | consumed samples: 69488 | elapsed time per iteration (ms): 14511.4 | learning rate: 1.926E-05 | global batch size: 32 | lm loss: 6.538668E+00 | loss scale: 32768.0 | grad norm: 167613.706 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3649/ 159576 | consumed samples: 69520 | elapsed time per iteration (ms): 14499.6 | learning rate: 1.926E-05 | global batch size: 32 | lm loss: 6.650812E+00 | loss scale: 32768.0 | grad norm: 144019.361 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3650/ 159576 | consumed samples: 69552 | elapsed time per iteration (ms): 14509.6 | learning rate: 1.927E-05 | global batch size: 32 | lm loss: 6.449777E+00 | loss scale: 32768.0 | grad norm: 190635.397 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3651/ 159576 | consumed samples: 69584 | elapsed time per iteration (ms): 14775.5 | learning rate: 1.928E-05 | global batch size: 32 | lm loss: 6.435673E+00 | loss scale: 32768.0 | grad norm: 181537.989 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3652/ 159576 | consumed samples: 69616 | elapsed time per iteration (ms): 14563.5 | learning rate: 1.929E-05 | global batch size: 32 | lm loss: 6.631623E+00 | loss scale: 32768.0 | grad norm: 150202.284 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3653/ 159576 | consumed samples: 69648 | elapsed time per iteration (ms): 14524.8 | learning rate: 1.930E-05 | global batch size: 32 | lm loss: 6.612866E+00 | loss scale: 32768.0 | grad norm: 136863.545 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3654/ 159576 | consumed samples: 69680 | elapsed time per iteration (ms): 14611.3 | learning rate: 1.931E-05 | global batch size: 32 | lm loss: 6.471664E+00 | loss scale: 32768.0 | grad norm: 177103.324 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3655/ 159576 | consumed samples: 69712 | elapsed time per iteration (ms): 14752.9 | learning rate: 1.932E-05 | global batch size: 32 | lm loss: 6.436707E+00 | loss scale: 32768.0 | grad norm: 107210.342 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3656/ 159576 | consumed samples: 69744 | elapsed time per iteration (ms): 14544.1 | learning rate: 1.933E-05 | global batch size: 32 | lm loss: 6.679466E+00 | loss scale: 32768.0 | grad norm: 156389.742 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3657/ 159576 | consumed samples: 69776 | elapsed time per iteration (ms): 14560.9 | learning rate: 1.934E-05 | global batch size: 32 | lm loss: 6.478530E+00 | loss scale: 32768.0 | grad norm: 136151.461 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3658/ 159576 | consumed samples: 69808 | elapsed time per iteration (ms): 14516.8 | learning rate: 1.934E-05 | global batch size: 32 | lm loss: 6.537941E+00 | loss scale: 32768.0 | grad norm: 169825.588 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3659/ 159576 | consumed samples: 69840 | elapsed time per iteration (ms): 15041.8 | learning rate: 1.935E-05 | global batch size: 32 | lm loss: 6.414840E+00 | loss scale: 32768.0 | grad norm: 116305.156 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3660/ 159576 | consumed samples: 69872 | elapsed time per iteration (ms): 14596.0 | learning rate: 1.936E-05 | global batch size: 32 | lm loss: 6.423607E+00 | loss scale: 32768.0 | grad norm: 157726.425 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3661/ 159576 | consumed samples: 69904 | elapsed time per iteration (ms): 14600.4 | learning rate: 1.937E-05 | global batch size: 32 | lm loss: 6.516055E+00 | loss scale: 32768.0 | grad norm: 150170.125 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3662/ 159576 | consumed samples: 69936 | elapsed time per iteration (ms): 14508.1 | learning rate: 1.938E-05 | global batch size: 32 | lm loss: 6.406610E+00 | loss scale: 32768.0 | grad norm: 180125.834 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3663/ 159576 | consumed samples: 69968 | elapsed time per iteration (ms): 14795.2 | learning rate: 1.939E-05 | global batch size: 32 | lm loss: 6.495340E+00 | loss scale: 32768.0 | grad norm: 156226.253 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3664/ 159576 | consumed samples: 70000 | elapsed time per iteration (ms): 14502.7 | learning rate: 1.940E-05 | global batch size: 32 | lm loss: 6.478324E+00 | loss scale: 32768.0 | grad norm: 139199.774 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3665/ 159576 | consumed samples: 70032 | elapsed time per iteration (ms): 14521.4 | learning rate: 1.941E-05 | global batch size: 32 | lm loss: 6.486080E+00 | loss scale: 32768.0 | grad norm: 139987.206 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3666/ 159576 | consumed samples: 70064 | elapsed time per iteration (ms): 14501.0 | learning rate: 1.942E-05 | global batch size: 32 | lm loss: 6.412463E+00 | loss scale: 32768.0 | grad norm: 187000.562 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3667/ 159576 | consumed samples: 70096 | elapsed time per iteration (ms): 14907.7 | learning rate: 1.942E-05 | global batch size: 32 | lm loss: 6.555160E+00 | loss scale: 32768.0 | grad norm: 151236.383 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3668/ 159576 | consumed samples: 70128 | elapsed time per iteration (ms): 14546.0 | learning rate: 1.943E-05 | global batch size: 32 | lm loss: 6.466833E+00 | loss scale: 32768.0 | grad norm: 188341.809 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3669/ 159576 | consumed samples: 70160 | elapsed time per iteration (ms): 14504.0 | learning rate: 1.944E-05 | global batch size: 32 | lm loss: 6.512917E+00 | loss scale: 32768.0 | grad norm: 142898.213 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3670/ 159576 | consumed samples: 70192 | elapsed time per iteration (ms): 14550.7 | learning rate: 1.945E-05 | global batch size: 32 | lm loss: 6.662933E+00 | loss scale: 32768.0 | grad norm: 155470.352 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3671/ 159576 | consumed samples: 70224 | elapsed time per iteration (ms): 14892.4 | learning rate: 1.946E-05 | global batch size: 32 | lm loss: 6.373161E+00 | loss scale: 32768.0 | grad norm: 150042.585 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3672/ 159576 | consumed samples: 70256 | elapsed time per iteration (ms): 14566.7 | learning rate: 1.947E-05 | global batch size: 32 | lm loss: 6.426474E+00 | loss scale: 32768.0 | grad norm: 170805.274 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3673/ 159576 | consumed samples: 70288 | elapsed time per iteration (ms): 14501.7 | learning rate: 1.948E-05 | global batch size: 32 | lm loss: 6.370544E+00 | loss scale: 32768.0 | grad norm: 138493.754 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3674/ 159576 | consumed samples: 70320 | elapsed time per iteration (ms): 14600.9 | learning rate: 1.949E-05 | global batch size: 32 | lm loss: 6.383911E+00 | loss scale: 32768.0 | grad norm: 137200.588 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3675/ 159576 | consumed samples: 70352 | elapsed time per iteration (ms): 14904.3 | learning rate: 1.950E-05 | global batch size: 32 | lm loss: 6.430146E+00 | loss scale: 32768.0 | grad norm: 130856.844 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3676/ 159576 | consumed samples: 70384 | elapsed time per iteration (ms): 14544.1 | learning rate: 1.950E-05 | global batch size: 32 | lm loss: 6.359234E+00 | loss scale: 32768.0 | grad norm: 123290.267 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3677/ 159576 | consumed samples: 70416 | elapsed time per iteration (ms): 14660.6 | learning rate: 1.951E-05 | global batch size: 32 | lm loss: 6.340640E+00 | loss scale: 32768.0 | grad norm: 128445.878 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3678/ 159576 | consumed samples: 70448 | elapsed time per iteration (ms): 14469.4 | learning rate: 1.952E-05 | global batch size: 32 | lm loss: 6.467716E+00 | loss scale: 32768.0 | grad norm: 222732.002 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3679/ 159576 | consumed samples: 70480 | elapsed time per iteration (ms): 14540.6 | learning rate: 1.953E-05 | global batch size: 32 | lm loss: 6.401999E+00 | loss scale: 32768.0 | grad norm: 143732.695 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3680/ 159576 | consumed samples: 70512 | elapsed time per iteration (ms): 14837.8 | learning rate: 1.954E-05 | global batch size: 32 | lm loss: 6.469200E+00 | loss scale: 32768.0 | grad norm: 148617.864 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3681/ 159576 | consumed samples: 70544 | elapsed time per iteration (ms): 14560.6 | learning rate: 1.955E-05 | global batch size: 32 | lm loss: 6.503996E+00 | loss scale: 32768.0 | grad norm: 151584.667 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3682/ 159576 | consumed samples: 70576 | elapsed time per iteration (ms): 14533.4 | learning rate: 1.956E-05 | global batch size: 32 | lm loss: 6.473675E+00 | loss scale: 32768.0 | grad norm: 171148.885 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3683/ 159576 | consumed samples: 70608 | elapsed time per iteration (ms): 14606.7 | learning rate: 1.957E-05 | global batch size: 32 | lm loss: 6.406356E+00 | loss scale: 32768.0 | grad norm: 139281.574 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3684/ 159576 | consumed samples: 70640 | elapsed time per iteration (ms): 14772.8 | learning rate: 1.958E-05 | global batch size: 32 | lm loss: 6.329139E+00 | loss scale: 32768.0 | grad norm: 108055.730 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3685/ 159576 | consumed samples: 70672 | elapsed time per iteration (ms): 14518.6 | learning rate: 1.958E-05 | global batch size: 32 | lm loss: 6.525671E+00 | loss scale: 32768.0 | grad norm: 204684.374 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3686/ 159576 | consumed samples: 70704 | elapsed time per iteration (ms): 14569.3 | learning rate: 1.959E-05 | global batch size: 32 | lm loss: 6.454522E+00 | loss scale: 32768.0 | grad norm: 108450.408 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3687/ 159576 | consumed samples: 70736 | elapsed time per iteration (ms): 14527.9 | learning rate: 1.960E-05 | global batch size: 32 | lm loss: 6.452621E+00 | loss scale: 32768.0 | grad norm: 154981.336 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3688/ 159576 | consumed samples: 70768 | elapsed time per iteration (ms): 14681.9 | learning rate: 1.961E-05 | global batch size: 32 | lm loss: 6.485929E+00 | loss scale: 32768.0 | grad norm: 132389.054 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3689/ 159576 | consumed samples: 70800 | elapsed time per iteration (ms): 14628.9 | learning rate: 1.962E-05 | global batch size: 32 | lm loss: 6.560607E+00 | loss scale: 32768.0 | grad norm: 244618.808 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3690/ 159576 | consumed samples: 70832 | elapsed time per iteration (ms): 14570.6 | learning rate: 1.963E-05 | global batch size: 32 | lm loss: 6.545405E+00 | loss scale: 32768.0 | grad norm: 207471.493 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3691/ 159576 | consumed samples: 70864 | elapsed time per iteration (ms): 14568.4 | learning rate: 1.964E-05 | global batch size: 32 | lm loss: 6.403141E+00 | loss scale: 32768.0 | grad norm: 160751.609 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3692/ 159576 | consumed samples: 70896 | elapsed time per iteration (ms): 14828.9 | learning rate: 1.965E-05 | global batch size: 32 | lm loss: 6.494320E+00 | loss scale: 32768.0 | grad norm: 142715.734 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3693/ 159576 | consumed samples: 70928 | elapsed time per iteration (ms): 14576.4 | learning rate: 1.966E-05 | global batch size: 32 | lm loss: 6.317194E+00 | loss scale: 32768.0 | grad norm: 218725.519 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3694/ 159576 | consumed samples: 70960 | elapsed time per iteration (ms): 14558.1 | learning rate: 1.966E-05 | global batch size: 32 | lm loss: 6.404289E+00 | loss scale: 32768.0 | grad norm: 133735.905 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3695/ 159576 | consumed samples: 70992 | elapsed time per iteration (ms): 14502.5 | learning rate: 1.967E-05 | global batch size: 32 | lm loss: 6.501413E+00 | loss scale: 32768.0 | grad norm: 126881.621 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3696/ 159576 | consumed samples: 71024 | elapsed time per iteration (ms): 14876.1 | learning rate: 1.968E-05 | global batch size: 32 | lm loss: 6.348512E+00 | loss scale: 32768.0 | grad norm: 117844.529 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3697/ 159576 | consumed samples: 71056 | elapsed time per iteration (ms): 14704.7 | learning rate: 1.969E-05 | global batch size: 32 | lm loss: 6.490881E+00 | loss scale: 32768.0 | grad norm: 191050.826 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3698/ 159576 | consumed samples: 71088 | elapsed time per iteration (ms): 14521.5 | learning rate: 1.970E-05 | global batch size: 32 | lm loss: 6.399506E+00 | loss scale: 32768.0 | grad norm: 131579.663 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3699/ 159576 | consumed samples: 71120 | elapsed time per iteration (ms): 14570.1 | learning rate: 1.971E-05 | global batch size: 32 | lm loss: 6.507861E+00 | loss scale: 32768.0 | grad norm: 124970.942 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3700/ 159576 | consumed samples: 71152 | elapsed time per iteration (ms): 15037.4 | learning rate: 1.972E-05 | global batch size: 32 | lm loss: 6.460707E+00 | loss scale: 32768.0 | grad norm: 163864.847 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3701/ 159576 | consumed samples: 71184 | elapsed time per iteration (ms): 14616.1 | learning rate: 1.973E-05 | global batch size: 32 | lm loss: 6.410345E+00 | loss scale: 32768.0 | grad norm: 155995.488 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3702/ 159576 | consumed samples: 71216 | elapsed time per iteration (ms): 14555.1 | learning rate: 1.974E-05 | global batch size: 32 | lm loss: 6.418409E+00 | loss scale: 32768.0 | grad norm: 135398.679 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3703/ 159576 | consumed samples: 71248 | elapsed time per iteration (ms): 14529.9 | learning rate: 1.974E-05 | global batch size: 32 | lm loss: 6.445669E+00 | loss scale: 32768.0 | grad norm: 149575.193 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3704/ 159576 | consumed samples: 71280 | elapsed time per iteration (ms): 14938.6 | learning rate: 1.975E-05 | global batch size: 32 | lm loss: 6.466682E+00 | loss scale: 32768.0 | grad norm: 158480.859 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3705/ 159576 | consumed samples: 71312 | elapsed time per iteration (ms): 14501.2 | learning rate: 1.976E-05 | global batch size: 32 | lm loss: 6.391745E+00 | loss scale: 32768.0 | grad norm: 130405.062 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3706/ 159576 | consumed samples: 71344 | elapsed time per iteration (ms): 14560.8 | learning rate: 1.977E-05 | global batch size: 32 | lm loss: 6.367959E+00 | loss scale: 32768.0 | grad norm: 134894.924 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3707/ 159576 | consumed samples: 71376 | elapsed time per iteration (ms): 14606.1 | learning rate: 1.978E-05 | global batch size: 32 | lm loss: 6.568520E+00 | loss scale: 32768.0 | grad norm: 127252.532 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3708/ 159576 | consumed samples: 71408 | elapsed time per iteration (ms): 14831.0 | learning rate: 1.979E-05 | global batch size: 32 | lm loss: 6.451063E+00 | loss scale: 32768.0 | grad norm: 352497.893 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3709/ 159576 | consumed samples: 71440 | elapsed time per iteration (ms): 14547.0 | learning rate: 1.980E-05 | global batch size: 32 | lm loss: 6.534979E+00 | loss scale: 32768.0 | grad norm: 139565.555 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3710/ 159576 | consumed samples: 71472 | elapsed time per iteration (ms): 14583.9 | learning rate: 1.981E-05 | global batch size: 32 | lm loss: 6.561714E+00 | loss scale: 32768.0 | grad norm: 190647.531 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3711/ 159576 | consumed samples: 71504 | elapsed time per iteration (ms): 14605.2 | learning rate: 1.982E-05 | global batch size: 32 | lm loss: 6.594619E+00 | loss scale: 32768.0 | grad norm: 159179.628 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3712/ 159576 | consumed samples: 71536 | elapsed time per iteration (ms): 14853.8 | learning rate: 1.982E-05 | global batch size: 32 | lm loss: 6.221584E+00 | loss scale: 32768.0 | grad norm: 163662.318 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3713/ 159576 | consumed samples: 71568 | elapsed time per iteration (ms): 14625.6 | learning rate: 1.983E-05 | global batch size: 32 | lm loss: 6.384083E+00 | loss scale: 32768.0 | grad norm: 157426.857 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3714/ 159576 | consumed samples: 71600 | elapsed time per iteration (ms): 14617.1 | learning rate: 1.984E-05 | global batch size: 32 | lm loss: 6.457389E+00 | loss scale: 32768.0 | grad norm: 163827.138 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3715/ 159576 | consumed samples: 71632 | elapsed time per iteration (ms): 14519.7 | learning rate: 1.985E-05 | global batch size: 32 | lm loss: 6.461262E+00 | loss scale: 32768.0 | grad norm: 150641.403 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3716/ 159576 | consumed samples: 71664 | elapsed time per iteration (ms): 14921.5 | learning rate: 1.986E-05 | global batch size: 32 | lm loss: 6.345608E+00 | loss scale: 32768.0 | grad norm: 146728.063 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3717/ 159576 | consumed samples: 71696 | elapsed time per iteration (ms): 14643.5 | learning rate: 1.987E-05 | global batch size: 32 | lm loss: 6.488680E+00 | loss scale: 32768.0 | grad norm: 159547.980 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3718/ 159576 | consumed samples: 71728 | elapsed time per iteration (ms): 14531.6 | learning rate: 1.988E-05 | global batch size: 32 | lm loss: 6.358843E+00 | loss scale: 32768.0 | grad norm: 120331.967 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3719/ 159576 | consumed samples: 71760 | elapsed time per iteration (ms): 14544.0 | learning rate: 1.989E-05 | global batch size: 32 | lm loss: 6.480108E+00 | loss scale: 32768.0 | grad norm: 136903.050 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3720/ 159576 | consumed samples: 71792 | elapsed time per iteration (ms): 14789.8 | learning rate: 1.989E-05 | global batch size: 32 | lm loss: 6.423407E+00 | loss scale: 32768.0 | grad norm: 144666.737 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3721/ 159576 | consumed samples: 71824 | elapsed time per iteration (ms): 14759.3 | learning rate: 1.990E-05 | global batch size: 32 | lm loss: 6.280478E+00 | loss scale: 32768.0 | grad norm: 131505.636 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3722/ 159576 | consumed samples: 71856 | elapsed time per iteration (ms): 14493.1 | learning rate: 1.991E-05 | global batch size: 32 | lm loss: 6.341520E+00 | loss scale: 32768.0 | grad norm: 153861.927 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3723/ 159576 | consumed samples: 71888 | elapsed time per iteration (ms): 14523.6 | learning rate: 1.992E-05 | global batch size: 32 | lm loss: 6.470270E+00 | loss scale: 32768.0 | grad norm: 129755.757 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3724/ 159576 | consumed samples: 71920 | elapsed time per iteration (ms): 14486.1 | learning rate: 1.993E-05 | global batch size: 32 | lm loss: 6.425168E+00 | loss scale: 32768.0 | grad norm: 117324.517 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3725/ 159576 | consumed samples: 71952 | elapsed time per iteration (ms): 14760.5 | learning rate: 1.994E-05 | global batch size: 32 | lm loss: 6.508280E+00 | loss scale: 32768.0 | grad norm: 128492.118 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3726/ 159576 | consumed samples: 71984 | elapsed time per iteration (ms): 14523.7 | learning rate: 1.995E-05 | global batch size: 32 | lm loss: 6.451111E+00 | loss scale: 32768.0 | grad norm: 167230.725 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3727/ 159576 | consumed samples: 72016 | elapsed time per iteration (ms): 14569.3 | learning rate: 1.996E-05 | global batch size: 32 | lm loss: 6.428119E+00 | loss scale: 32768.0 | grad norm: 118648.215 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3728/ 159576 | consumed samples: 72048 | elapsed time per iteration (ms): 14495.2 | learning rate: 1.997E-05 | global batch size: 32 | lm loss: 6.472005E+00 | loss scale: 32768.0 | grad norm: 129074.469 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3729/ 159576 | consumed samples: 72080 | elapsed time per iteration (ms): 14750.9 | learning rate: 1.997E-05 | global batch size: 32 | lm loss: 6.501527E+00 | loss scale: 32768.0 | grad norm: 149114.403 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3730/ 159576 | consumed samples: 72112 | elapsed time per iteration (ms): 14542.0 | learning rate: 1.998E-05 | global batch size: 32 | lm loss: 6.441484E+00 | loss scale: 32768.0 | grad norm: 115103.080 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3731/ 159576 | consumed samples: 72144 | elapsed time per iteration (ms): 14563.9 | learning rate: 1.999E-05 | global batch size: 32 | lm loss: 6.365570E+00 | loss scale: 32768.0 | grad norm: 122866.924 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3732/ 159576 | consumed samples: 72176 | elapsed time per iteration (ms): 14514.0 | learning rate: 2.000E-05 | global batch size: 32 | lm loss: 6.432354E+00 | loss scale: 32768.0 | grad norm: 117503.601 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3733/ 159576 | consumed samples: 72208 | elapsed time per iteration (ms): 14782.6 | learning rate: 2.001E-05 | global batch size: 32 | lm loss: 6.406446E+00 | loss scale: 32768.0 | grad norm: 118771.624 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3734/ 159576 | consumed samples: 72240 | elapsed time per iteration (ms): 14599.5 | learning rate: 2.002E-05 | global batch size: 32 | lm loss: 6.564467E+00 | loss scale: 32768.0 | grad norm: 113605.510 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3735/ 159576 | consumed samples: 72272 | elapsed time per iteration (ms): 14490.9 | learning rate: 2.003E-05 | global batch size: 32 | lm loss: 6.709463E+00 | loss scale: 32768.0 | grad norm: 143048.505 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3736/ 159576 | consumed samples: 72304 | elapsed time per iteration (ms): 14616.2 | learning rate: 2.004E-05 | global batch size: 32 | lm loss: 6.388952E+00 | loss scale: 32768.0 | grad norm: 148752.246 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3737/ 159576 | consumed samples: 72336 | elapsed time per iteration (ms): 14690.4 | learning rate: 2.005E-05 | global batch size: 32 | lm loss: 6.671305E+00 | loss scale: 32768.0 | grad norm: 167080.674 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3738/ 159576 | consumed samples: 72368 | elapsed time per iteration (ms): 14577.2 | learning rate: 2.005E-05 | global batch size: 32 | lm loss: 6.441625E+00 | loss scale: 32768.0 | grad norm: 132744.798 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3739/ 159576 | consumed samples: 72400 | elapsed time per iteration (ms): 14526.3 | learning rate: 2.006E-05 | global batch size: 32 | lm loss: 6.382997E+00 | loss scale: 32768.0 | grad norm: 137597.004 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3740/ 159576 | consumed samples: 72432 | elapsed time per iteration (ms): 14497.0 | learning rate: 2.007E-05 | global batch size: 32 | lm loss: 6.423009E+00 | loss scale: 32768.0 | grad norm: 158026.136 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3741/ 159576 | consumed samples: 72464 | elapsed time per iteration (ms): 14972.2 | learning rate: 2.008E-05 | global batch size: 32 | lm loss: 6.350714E+00 | loss scale: 32768.0 | grad norm: 133556.244 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3742/ 159576 | consumed samples: 72496 | elapsed time per iteration (ms): 14524.0 | learning rate: 2.009E-05 | global batch size: 32 | lm loss: 6.481720E+00 | loss scale: 32768.0 | grad norm: 111295.642 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3743/ 159576 | consumed samples: 72528 | elapsed time per iteration (ms): 14585.5 | learning rate: 2.010E-05 | global batch size: 32 | lm loss: 6.427812E+00 | loss scale: 32768.0 | grad norm: 147125.472 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3744/ 159576 | consumed samples: 72560 | elapsed time per iteration (ms): 14494.4 | learning rate: 2.011E-05 | global batch size: 32 | lm loss: 6.548944E+00 | loss scale: 32768.0 | grad norm: 157070.428 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3745/ 159576 | consumed samples: 72592 | elapsed time per iteration (ms): 14860.3 | learning rate: 2.012E-05 | global batch size: 32 | lm loss: 6.524699E+00 | loss scale: 32768.0 | grad norm: 133650.321 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3746/ 159576 | consumed samples: 72624 | elapsed time per iteration (ms): 14524.8 | learning rate: 2.013E-05 | global batch size: 32 | lm loss: 6.462801E+00 | loss scale: 32768.0 | grad norm: 145785.393 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3747/ 159576 | consumed samples: 72656 | elapsed time per iteration (ms): 14508.2 | learning rate: 2.013E-05 | global batch size: 32 | lm loss: 6.505124E+00 | loss scale: 32768.0 | grad norm: 159039.833 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3748/ 159576 | consumed samples: 72688 | elapsed time per iteration (ms): 14534.8 | learning rate: 2.014E-05 | global batch size: 32 | lm loss: 6.554813E+00 | loss scale: 32768.0 | grad norm: 144107.066 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3749/ 159576 | consumed samples: 72720 | elapsed time per iteration (ms): 14885.2 | learning rate: 2.015E-05 | global batch size: 32 | lm loss: 6.509037E+00 | loss scale: 32768.0 | grad norm: 139312.960 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3750/ 159576 | consumed samples: 72752 | elapsed time per iteration (ms): 14531.0 | learning rate: 2.016E-05 | global batch size: 32 | lm loss: 6.393044E+00 | loss scale: 32768.0 | grad norm: 177829.341 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3751/ 159576 | consumed samples: 72784 | elapsed time per iteration (ms): 14500.7 | learning rate: 2.017E-05 | global batch size: 32 | lm loss: 6.362189E+00 | loss scale: 32768.0 | grad norm: 176679.914 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3752/ 159576 | consumed samples: 72816 | elapsed time per iteration (ms): 14533.8 | learning rate: 2.018E-05 | global batch size: 32 | lm loss: 6.594802E+00 | loss scale: 32768.0 | grad norm: 172136.738 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3753/ 159576 | consumed samples: 72848 | elapsed time per iteration (ms): 7743.9 | learning rate: 2.018E-05 | global batch size: 32 | lm loss: 6.535247E+00 | loss scale: 32768.0 | grad norm: 172136.738 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3754/ 159576 | consumed samples: 72880 | elapsed time per iteration (ms): 14383.1 | learning rate: 2.019E-05 | global batch size: 32 | lm loss: 6.354399E+00 | loss scale: 32768.0 | grad norm: 126648.407 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3755/ 159576 | consumed samples: 72912 | elapsed time per iteration (ms): 14590.3 | learning rate: 2.020E-05 | global batch size: 32 | lm loss: 6.473662E+00 | loss scale: 32768.0 | grad norm: 156295.152 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3756/ 159576 | consumed samples: 72944 | elapsed time per iteration (ms): 7767.7 | learning rate: 2.020E-05 | global batch size: 32 | lm loss: 6.609807E+00 | loss scale: 16384.0 | grad norm: 156295.152 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3757/ 159576 | consumed samples: 72976 | elapsed time per iteration (ms): 14046.4 | learning rate: 2.021E-05 | global batch size: 32 | lm loss: 6.389218E+00 | loss scale: 16384.0 | grad norm: 71738.658 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3758/ 159576 | consumed samples: 73008 | elapsed time per iteration (ms): 14805.7 | learning rate: 2.021E-05 | global batch size: 32 | lm loss: 6.361919E+00 | loss scale: 16384.0 | grad norm: 60700.631 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3759/ 159576 | consumed samples: 73040 | elapsed time per iteration (ms): 14722.8 | learning rate: 2.022E-05 | global batch size: 32 | lm loss: 6.447733E+00 | loss scale: 16384.0 | grad norm: 87663.180 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3760/ 159576 | consumed samples: 73072 | elapsed time per iteration (ms): 14583.0 | learning rate: 2.023E-05 | global batch size: 32 | lm loss: 6.446470E+00 | loss scale: 16384.0 | grad norm: 67781.743 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3761/ 159576 | consumed samples: 73104 | elapsed time per iteration (ms): 14493.9 | learning rate: 2.024E-05 | global batch size: 32 | lm loss: 6.378415E+00 | loss scale: 16384.0 | grad norm: 72177.747 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3762/ 159576 | consumed samples: 73136 | elapsed time per iteration (ms): 14567.8 | learning rate: 2.025E-05 | global batch size: 32 | lm loss: 6.576702E+00 | loss scale: 16384.0 | grad norm: 87501.793 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3763/ 159576 | consumed samples: 73168 | elapsed time per iteration (ms): 14732.6 | learning rate: 2.026E-05 | global batch size: 32 | lm loss: 6.522850E+00 | loss scale: 16384.0 | grad norm: 66784.841 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3764/ 159576 | consumed samples: 73200 | elapsed time per iteration (ms): 14572.5 | learning rate: 2.027E-05 | global batch size: 32 | lm loss: 6.361198E+00 | loss scale: 16384.0 | grad norm: 85761.754 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3765/ 159576 | consumed samples: 73232 | elapsed time per iteration (ms): 14647.5 | learning rate: 2.028E-05 | global batch size: 32 | lm loss: 6.605127E+00 | loss scale: 16384.0 | grad norm: 69863.144 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3766/ 159576 | consumed samples: 73264 | elapsed time per iteration (ms): 14606.0 | learning rate: 2.029E-05 | global batch size: 32 | lm loss: 6.398610E+00 | loss scale: 16384.0 | grad norm: 94809.931 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3767/ 159576 | consumed samples: 73296 | elapsed time per iteration (ms): 14708.7 | learning rate: 2.029E-05 | global batch size: 32 | lm loss: 6.484084E+00 | loss scale: 16384.0 | grad norm: 74741.244 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3768/ 159576 | consumed samples: 73328 | elapsed time per iteration (ms): 14555.4 | learning rate: 2.030E-05 | global batch size: 32 | lm loss: 6.496735E+00 | loss scale: 16384.0 | grad norm: 77000.443 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3769/ 159576 | consumed samples: 73360 | elapsed time per iteration (ms): 14556.9 | learning rate: 2.031E-05 | global batch size: 32 | lm loss: 6.386226E+00 | loss scale: 16384.0 | grad norm: 92155.881 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3770/ 159576 | consumed samples: 73392 | elapsed time per iteration (ms): 14623.6 | learning rate: 2.032E-05 | global batch size: 32 | lm loss: 6.446381E+00 | loss scale: 16384.0 | grad norm: 91554.158 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3771/ 159576 | consumed samples: 73424 | elapsed time per iteration (ms): 14736.8 | learning rate: 2.033E-05 | global batch size: 32 | lm loss: 6.477424E+00 | loss scale: 16384.0 | grad norm: 79287.956 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3772/ 159576 | consumed samples: 73456 | elapsed time per iteration (ms): 14586.8 | learning rate: 2.034E-05 | global batch size: 32 | lm loss: 6.505037E+00 | loss scale: 16384.0 | grad norm: 76395.186 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3773/ 159576 | consumed samples: 73488 | elapsed time per iteration (ms): 14638.2 | learning rate: 2.035E-05 | global batch size: 32 | lm loss: 6.536213E+00 | loss scale: 16384.0 | grad norm: 64411.593 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3774/ 159576 | consumed samples: 73520 | elapsed time per iteration (ms): 14533.1 | learning rate: 2.036E-05 | global batch size: 32 | lm loss: 6.477271E+00 | loss scale: 16384.0 | grad norm: 79531.325 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3775/ 159576 | consumed samples: 73552 | elapsed time per iteration (ms): 14956.5 | learning rate: 2.037E-05 | global batch size: 32 | lm loss: 6.364020E+00 | loss scale: 16384.0 | grad norm: 72312.067 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3776/ 159576 | consumed samples: 73584 | elapsed time per iteration (ms): 14572.0 | learning rate: 2.037E-05 | global batch size: 32 | lm loss: 6.331044E+00 | loss scale: 16384.0 | grad norm: 84164.363 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3777/ 159576 | consumed samples: 73616 | elapsed time per iteration (ms): 14594.9 | learning rate: 2.038E-05 | global batch size: 32 | lm loss: 6.512950E+00 | loss scale: 16384.0 | grad norm: 77822.381 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
73616 | elapsed time per iteration (ms): 14594.9 | learning rate: 2.038E-05 | global batch size: 32 | lm loss: 6.512950E+00 | loss scale: 16384.0 | grad norm: 77822.381 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3778/ 159576 | consumed samples: 73648 | elapsed time per iteration (ms): 14607.5 | learning rate: 2.039E-05 | global batch size: 32 | lm loss: 6.549839E+00 | loss scale: 16384.0 | grad norm: 66443.545 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3779/ 159576 | consumed samples: 73680 | elapsed time per iteration (ms): 14999.4 | learning rate: 2.040E-05 | global batch size: 32 | lm loss: 6.475536E+00 | loss scale: 16384.0 | grad norm: 88572.452 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3780/ 159576 | consumed samples: 73712 | elapsed time per iteration (ms): 14681.3 | learning rate: 2.041E-05 | global batch size: 32 | lm loss: 6.548042E+00 | loss scale: 16384.0 | grad norm: 74648.598 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3781/ 159576 | consumed samples: 73744 | elapsed time per iteration (ms): 14610.5 | learning rate: 2.042E-05 | global batch size: 32 | lm loss: 6.445394E+00 | loss scale: 16384.0 | grad norm: 79663.249 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3782/ 159576 | consumed samples: 73776 | elapsed time per iteration (ms): 14624.0 | learning rate: 2.043E-05 | global batch size: 32 | lm loss: 6.496744E+00 | loss scale: 16384.0 | grad norm: 77740.129 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3783/ 159576 | consumed samples: 73808 | elapsed time per iteration (ms): 15155.7 | learning rate: 2.044E-05 | global batch size: 32 | lm loss: 6.402834E+00 | loss scale: 16384.0 | grad norm: 74857.589 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3784/ 159576 | consumed samples: 73840 | elapsed time per iteration (ms): 14584.9 | learning rate: 2.045E-05 | global batch size: 32 | lm loss: 6.375038E+00 | loss scale: 16384.0 | grad norm: 86117.345 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3785/ 159576 | consumed samples: 73872 | elapsed time per iteration (ms): 14634.8 | learning rate: 2.045E-05 | global batch size: 32 | lm loss: 6.507965E+00 | loss scale: 16384.0 | grad norm: 78691.029 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3786/ 159576 | consumed samples: 73904 | elapsed time per iteration (ms): 14635.7 | learning rate: 2.046E-05 | global batch size: 32 | lm loss: 6.375463E+00 | loss scale: 16384.0 | grad norm: 105222.970 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3787/ 159576 | consumed samples: 73936 | elapsed time per iteration (ms): 14981.3 | learning rate: 2.047E-05 | global batch size: 32 | lm loss: 6.494486E+00 | loss scale: 16384.0 | grad norm: 70745.031 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3788/ 159576 | consumed samples: 73968 | elapsed time per iteration (ms): 14576.6 | learning rate: 2.048E-05 | global batch size: 32 | lm loss: 6.350873E+00 | loss scale: 16384.0 | grad norm: 
81350.508 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3789/ 159576 | consumed samples: 74000 | elapsed time per iteration (ms): 14674.5 | learning rate: 2.049E-05 | global batch size: 32 | lm loss: 6.467069E+00 | loss scale: 16384.0 | grad norm: 84086.046 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3790/ 159576 | consumed samples: 74032 | elapsed time per iteration (ms): 14585.2 | learning rate: 2.050E-05 | global batch size: 32 | lm loss: 6.420381E+00 | loss scale: 16384.0 | grad norm: 79517.176 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3791/ 159576 | consumed samples: 74064 | elapsed time per iteration (ms): 14845.4 | learning rate: 2.051E-05 | global batch size: 32 | lm loss: 6.528859E+00 | loss scale: 16384.0 | grad norm: 87747.432 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3792/ 159576 | consumed samples: 74096 | elapsed time per iteration (ms): 14671.9 | learning rate: 2.052E-05 | global batch size: 32 | lm loss: 6.445452E+00 | loss scale: 16384.0 | grad norm: 76185.902 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3793/ 159576 | consumed samples: 74128 | elapsed time per iteration (ms): 14614.2 | learning rate: 2.053E-05 | global batch size: 32 | lm loss: 6.579043E+00 | loss scale: 16384.0 | grad norm: 85891.467 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3794/ 159576 | consumed samples: 74160 | elapsed time per iteration (ms): 14636.7 | learning rate: 2.053E-05 | global batch size: 32 | lm loss: 6.481782E+00 | loss scale: 16384.0 | grad norm: 62633.733 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3795/ 159576 | consumed samples: 74192 | elapsed time per iteration (ms): 14963.5 | learning rate: 2.054E-05 | global batch size: 32 | lm loss: 6.517486E+00 | loss scale: 16384.0 | grad norm: 67403.184 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3796/ 159576 | consumed samples: 74224 | elapsed time per iteration (ms): 14620.1 | learning rate: 2.055E-05 | global batch size: 32 | lm loss: 6.417095E+00 | loss scale: 16384.0 | grad norm: 62157.167 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3797/ 159576 | consumed samples: 74256 | elapsed time per iteration (ms): 14620.8 | learning rate: 2.056E-05 | global batch size: 32 | lm loss: 6.419306E+00 | loss scale: 16384.0 | grad norm: 73456.568 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3798/ 159576 | consumed samples: 74288 | elapsed time per iteration (ms): 14577.9 | learning rate: 2.057E-05 | global batch size: 32 | lm loss: 6.487021E+00 | loss scale: 16384.0 | grad norm: 67613.601 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3799/ 159576 | consumed samples: 74320 | elapsed time per iteration (ms): 14963.8 | learning rate: 2.058E-05 | global batch size: 32 | lm loss: 6.459682E+00 | loss scale: 16384.0 | grad norm: 73515.295 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3800/ 159576 | consumed samples: 74352 | elapsed 
time per iteration (ms): 14567.9 | learning rate: 2.059E-05 | global batch size: 32 | lm loss: 6.321566E+00 | loss scale: 16384.0 | grad norm: 77546.231 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3801/ 159576 | consumed samples: 74384 | elapsed time per iteration (ms): 14600.7 | learning rate: 2.060E-05 | global batch size: 32 | lm loss: 6.582398E+00 | loss scale: 16384.0 | grad norm: 78424.143 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3802/ 159576 | consumed samples: 74416 | elapsed time per iteration (ms): 14644.4 | learning rate: 2.061E-05 | global batch size: 32 | lm loss: 6.394701E+00 | loss scale: 16384.0 | grad norm: 82174.617 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3803/ 159576 | consumed samples: 74448 | elapsed time per iteration (ms): 14905.7 | learning rate: 2.061E-05 | global batch size: 32 | lm loss: 6.388845E+00 | loss scale: 16384.0 | grad norm: 67050.595 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3804/ 159576 | consumed samples: 74480 | elapsed time per iteration (ms): 14636.0 | learning rate: 2.062E-05 | global batch size: 32 | lm loss: 6.513092E+00 | loss scale: 16384.0 | grad norm: 118423.488 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3805/ 159576 | consumed samples: 74512 | elapsed time per iteration (ms): 14511.9 | learning rate: 2.063E-05 | global batch size: 32 | lm loss: 6.418696E+00 | loss scale: 16384.0 | grad norm: 71096.098 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3806/ 159576 | consumed samples: 74544 | elapsed time per iteration (ms): 14523.9 | learning rate: 2.064E-05 | global batch size: 32 | lm loss: 6.286570E+00 | loss scale: 16384.0 | grad norm: 93004.901 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3807/ 159576 | consumed samples: 74576 | elapsed time per iteration (ms): 14509.8 | learning rate: 2.065E-05 | global batch size: 32 | lm loss: 6.565314E+00 | loss scale: 16384.0 | grad norm: 76207.242 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3808/ 159576 | consumed samples: 74608 | elapsed time per iteration (ms): 15001.7 | learning rate: 2.066E-05 | global batch size: 32 | lm loss: 6.597963E+00 | loss scale: 16384.0 | grad norm: 136405.382 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3809/ 159576 | consumed samples: 74640 | elapsed time per iteration (ms): 14540.5 | learning rate: 2.067E-05 | global batch size: 32 | lm loss: 6.619783E+00 | loss scale: 16384.0 | grad norm: 75270.102 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3810/ 159576 | consumed samples: 74672 | elapsed time per iteration (ms): 14582.3 | learning rate: 2.068E-05 | global batch size: 32 | lm loss: 6.406981E+00 | loss scale: 16384.0 | grad norm: 81052.948 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3811/ 159576 | consumed samples: 74704 | elapsed time per iteration (ms): 14512.1 | learning rate: 2.068E-05 | global batch size: 32 | lm loss: 6.487488E+00 | loss scale: 16384.0 | grad norm: 87400.939 | num 
zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3812/ 159576 | consumed samples: 74736 | elapsed time per iteration (ms): 14767.4 | learning rate: 2.069E-05 | global batch size: 32 | lm loss: 6.416305E+00 | loss scale: 16384.0 | grad norm: 104809.852 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3813/ 159576 | consumed samples: 74768 | elapsed time per iteration (ms): 14457.6 | learning rate: 2.070E-05 | global batch size: 32 | lm loss: 6.405777E+00 | loss scale: 16384.0 | grad norm: 79282.350 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3814/ 159576 | consumed samples: 74800 | elapsed time per iteration (ms): 14520.7 | learning rate: 2.071E-05 | global batch size: 32 | lm loss: 6.435395E+00 | loss scale: 16384.0 | grad norm: 75788.929 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3815/ 159576 | consumed samples: 74832 | elapsed time per iteration (ms): 14520.3 | learning rate: 2.072E-05 | global batch size: 32 | lm loss: 6.324138E+00 | loss scale: 16384.0 | grad norm: 77448.416 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3816/ 159576 | consumed samples: 74864 | elapsed time per iteration (ms): 14756.0 | learning rate: 2.073E-05 | global batch size: 32 | lm loss: 6.479269E+00 | loss scale: 16384.0 | grad norm: 80928.548 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3817/ 159576 | consumed samples: 74896 | elapsed time per iteration (ms): 14631.8 | learning rate: 2.074E-05 | global batch size: 32 | lm loss: 6.448977E+00 | loss scale: 16384.0 | grad norm: 81667.758 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3818/ 159576 | consumed samples: 74928 | elapsed time per iteration (ms): 14631.1 | learning rate: 2.075E-05 | global batch size: 32 | lm loss: 6.550106E+00 | loss scale: 16384.0 | grad norm: 65592.243 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3819/ 159576 | consumed samples: 74960 | elapsed time per iteration (ms): 14596.0 | learning rate: 2.076E-05 | global batch size: 32 | lm loss: 6.589513E+00 | loss scale: 16384.0 | grad norm: 93606.472 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3820/ 159576 | consumed samples: 74992 | elapsed time per iteration (ms): 14800.0 | learning rate: 2.076E-05 | global batch size: 32 | lm loss: 6.472552E+00 | loss scale: 16384.0 | grad norm: 63974.308 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3821/ 159576 | consumed samples: 75024 | elapsed time per iteration (ms): 14588.9 | learning rate: 2.077E-05 | global batch size: 32 | lm loss: 6.366886E+00 | loss scale: 16384.0 | grad norm: 87736.372 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3822/ 159576 | consumed samples: 75056 | elapsed time per iteration (ms): 14606.9 | learning rate: 2.078E-05 | global batch size: 32 | lm loss: 6.523769E+00 | loss scale: 16384.0 | grad norm: 81803.750 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3823/ 159576 | consumed samples: 75088 | elapsed time per 
iteration (ms): 14588.5 | learning rate: 2.079E-05 | global batch size: 32 | lm loss: 6.495326E+00 | loss scale: 16384.0 | grad norm: 63058.316 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3824/ 159576 | consumed samples: 75120 | elapsed time per iteration (ms): 14986.3 | learning rate: 2.080E-05 | global batch size: 32 | lm loss: 6.557096E+00 | loss scale: 16384.0 | grad norm: 80592.237 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3825/ 159576 | consumed samples: 75152 | elapsed time per iteration (ms): 14684.5 | learning rate: 2.081E-05 | global batch size: 32 | lm loss: 6.436917E+00 | loss scale: 16384.0 | grad norm: 70196.423 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3826/ 159576 | consumed samples: 75184 | elapsed time per iteration (ms): 14627.4 | learning rate: 2.082E-05 | global batch size: 32 | lm loss: 6.413104E+00 | loss scale: 16384.0 | grad norm: 81085.772 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -[2021-09-24 17:07:03] PULSE: tr8-104B is waiting for the previous job to finish before scheduling a new one using the dependency mechanism (1165978_[1-10%1] on 'gpu_p13' partition) -[2021-09-24 17:07:03] PULSE: tr8-104B is running for 11:14:52 since 2021-09-24T05:52:11 (1162855_1 on 'gpu_p13' partition (r6i4n[5,7],r6i5n[2,7-8],r6i6n[0,2,6],r7i2n[4-5],r7i6n[2-4],r7i7n[7-8],r8i0n[2-3,5-8],r8i1n[0,2-4],r8i2n8,r8i3n[0-2],r8i5n[3-4],r8i7n[3-8],r9i0n[0-2],r9i1n[0-3],r9i2n[3-5,8],r9i3n[0-1,7-8],r9i4n[0-2],r9i5n[3-8],r9i6n[0,7-8]) - iteration 3827/ 159576 | consumed samples: 75216 | elapsed time per iteration (ms): 14531.9 | learning rate: 2.083E-05 | global batch size: 32 | lm loss: 6.427704E+00 | loss scale: 16384.0 | grad norm: 68943.205 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3828/ 159576 | consumed samples: 75248 | elapsed time per iteration (ms): 14988.1 | learning rate: 2.084E-05 | global batch size: 32 | lm loss: 6.347779E+00 | loss scale: 16384.0 | grad norm: 64095.719 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3829/ 159576 | consumed samples: 75280 | elapsed time per iteration (ms): 14665.9 | learning rate: 2.084E-05 | global batch size: 32 | lm loss: 6.411919E+00 | loss scale: 16384.0 | grad norm: 82008.163 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3830/ 159576 | consumed samples: 75312 | elapsed time per iteration (ms): 14539.9 | learning rate: 2.085E-05 | global batch size: 32 | lm loss: 6.458866E+00 | loss scale: 16384.0 | grad norm: 67971.949 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3831/ 159576 | consumed samples: 75344 | elapsed time per iteration (ms): 14600.2 | learning rate: 2.086E-05 | global batch size: 32 | lm loss: 6.450158E+00 | loss scale: 16384.0 | grad norm: 59376.432 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3832/ 159576 | consumed samples: 75376 | elapsed time per iteration (ms): 14931.8 | learning rate: 2.087E-05 | global batch size: 32 | lm loss: 6.537256E+00 | loss scale: 16384.0 | grad norm: 77538.560 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 
3833/ 159576 | consumed samples: 75408 | elapsed time per iteration (ms): 14592.6 | learning rate: 2.088E-05 | global batch size: 32 | lm loss: 6.392985E+00 | loss scale: 16384.0 | grad norm: 84275.600 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3834/ 159576 | consumed samples: 75440 | elapsed time per iteration (ms): 14616.6 | learning rate: 2.089E-05 | global batch size: 32 | lm loss: 6.512251E+00 | loss scale: 16384.0 | grad norm: 80167.095 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3835/ 159576 | consumed samples: 75472 | elapsed time per iteration (ms): 14584.0 | learning rate: 2.090E-05 | global batch size: 32 | lm loss: 6.467295E+00 | loss scale: 16384.0 | grad norm: 85124.328 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3836/ 159576 | consumed samples: 75504 | elapsed time per iteration (ms): 14844.3 | learning rate: 2.091E-05 | global batch size: 32 | lm loss: 6.514040E+00 | loss scale: 16384.0 | grad norm: 71539.963 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3837/ 159576 | consumed samples: 75536 | elapsed time per iteration (ms): 14618.8 | learning rate: 2.092E-05 | global batch size: 32 | lm loss: 6.519591E+00 | loss scale: 16384.0 | grad norm: 89173.398 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3838/ 159576 | consumed samples: 75568 | elapsed time per iteration (ms): 14566.0 | learning rate: 2.092E-05 | global batch size: 32 | lm loss: 6.447284E+00 | loss scale: 16384.0 | grad norm: 86030.395 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3839/ 159576 | consumed samples: 75600 | elapsed time per iteration (ms): 14636.3 | learning rate: 2.093E-05 | global batch size: 32 | lm loss: 6.369718E+00 | loss scale: 16384.0 | grad norm: 66275.400 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3840/ 159576 | consumed samples: 75632 | elapsed time per iteration (ms): 14897.9 | learning rate: 2.094E-05 | global batch size: 32 | lm loss: 6.467171E+00 | loss scale: 16384.0 | grad norm: 82043.402 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3841/ 159576 | consumed samples: 75664 | elapsed time per iteration (ms): 14554.8 | learning rate: 2.095E-05 | global batch size: 32 | lm loss: 6.458669E+00 | loss scale: 16384.0 | grad norm: 73761.762 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3842/ 159576 | consumed samples: 75696 | elapsed time per iteration (ms): 14564.2 | learning rate: 2.096E-05 | global batch size: 32 | lm loss: 6.516797E+00 | loss scale: 16384.0 | grad norm: 83647.133 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3843/ 159576 | consumed samples: 75728 | elapsed time per iteration (ms): 14464.9 | learning rate: 2.097E-05 | global batch size: 32 | lm loss: 6.381551E+00 | loss scale: 16384.0 | grad norm: 58297.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3844/ 159576 | consumed samples: 75760 | elapsed time per iteration (ms): 14942.4 | learning rate: 2.098E-05 | global batch size: 32 | lm loss: 6.471825E+00 | loss 
scale: 16384.0 | grad norm: 82881.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3845/ 159576 | consumed samples: 75792 | elapsed time per iteration (ms): 14531.3 | learning rate: 2.099E-05 | global batch size: 32 | lm loss: 6.528457E+00 | loss scale: 16384.0 | grad norm: 67296.172 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3846/ 159576 | consumed samples: 75824 | elapsed time per iteration (ms): 14601.9 | learning rate: 2.100E-05 | global batch size: 32 | lm loss: 6.408827E+00 | loss scale: 16384.0 | grad norm: 67512.624 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3847/ 159576 | consumed samples: 75856 | elapsed time per iteration (ms): 14580.2 | learning rate: 2.100E-05 | global batch size: 32 | lm loss: 6.440091E+00 | loss scale: 16384.0 | grad norm: 78400.656 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3848/ 159576 | consumed samples: 75888 | elapsed time per iteration (ms): 14911.9 | learning rate: 2.101E-05 | global batch size: 32 | lm loss: 6.374573E+00 | loss scale: 16384.0 | grad norm: 85886.969 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3849/ 159576 | consumed samples: 75920 | elapsed time per iteration (ms): 14768.3 | learning rate: 2.102E-05 | global batch size: 32 | lm loss: 6.529835E+00 | loss scale: 16384.0 | grad norm: 71394.057 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3850/ 159576 | consumed samples: 75952 | elapsed time per iteration (ms): 14553.3 | learning rate: 2.103E-05 | global batch size: 32 | lm loss: 6.455585E+00 | loss scale: 16384.0 | grad norm: 67772.089 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3851/ 159576 | consumed samples: 75984 | elapsed time per iteration (ms): 14574.9 | learning rate: 2.104E-05 | global batch size: 32 | lm loss: 6.428284E+00 | loss scale: 16384.0 | grad norm: 110864.098 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3852/ 159576 | consumed samples: 76016 | elapsed time per iteration (ms): 14592.6 | learning rate: 2.105E-05 | global batch size: 32 | lm loss: 6.457644E+00 | loss scale: 16384.0 | grad norm: 73499.592 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3853/ 159576 | consumed samples: 76048 | elapsed time per iteration (ms): 14780.7 | learning rate: 2.106E-05 | global batch size: 32 | lm loss: 6.459057E+00 | loss scale: 16384.0 | grad norm: 71503.908 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3854/ 159576 | consumed samples: 76080 | elapsed time per iteration (ms): 14631.9 | learning rate: 2.107E-05 | global batch size: 32 | lm loss: 6.522111E+00 | loss scale: 16384.0 | grad norm: 73205.829 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3855/ 159576 | consumed samples: 76112 | elapsed time per iteration (ms): 14685.7 | learning rate: 2.108E-05 | global batch size: 32 | lm loss: 6.444643E+00 | loss scale: 16384.0 | grad norm: 70169.559 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3856/ 159576 | 
consumed samples: 76144 | elapsed time per iteration (ms): 14534.2 | learning rate: 2.108E-05 | global batch size: 32 | lm loss: 6.392300E+00 | loss scale: 16384.0 | grad norm: 81224.688 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3857/ 159576 | consumed samples: 76176 | elapsed time per iteration (ms): 14734.9 | learning rate: 2.109E-05 | global batch size: 32 | lm loss: 6.474737E+00 | loss scale: 16384.0 | grad norm: 76429.789 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3858/ 159576 | consumed samples: 76208 | elapsed time per iteration (ms): 14589.1 | learning rate: 2.110E-05 | global batch size: 32 | lm loss: 6.481500E+00 | loss scale: 16384.0 | grad norm: 76288.617 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3859/ 159576 | consumed samples: 76240 | elapsed time per iteration (ms): 14536.6 | learning rate: 2.111E-05 | global batch size: 32 | lm loss: 6.504058E+00 | loss scale: 16384.0 | grad norm: 75104.955 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3860/ 159576 | consumed samples: 76272 | elapsed time per iteration (ms): 14557.4 | learning rate: 2.112E-05 | global batch size: 32 | lm loss: 6.616935E+00 | loss scale: 16384.0 | grad norm: 73471.312 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3861/ 159576 | consumed samples: 76304 | elapsed time per iteration (ms): 14996.3 | learning rate: 2.113E-05 | global batch size: 32 | lm loss: 6.437632E+00 | loss scale: 16384.0 | grad norm: 100626.814 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3862/ 159576 | consumed samples: 76336 | elapsed time per iteration (ms): 14610.8 | learning rate: 2.114E-05 | global batch size: 32 | lm loss: 6.358921E+00 | loss scale: 16384.0 | grad norm: 84367.846 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3863/ 159576 | consumed samples: 76368 | elapsed time per iteration (ms): 14574.0 | learning rate: 2.115E-05 | global batch size: 32 | lm loss: 6.489450E+00 | loss scale: 16384.0 | grad norm: 111308.083 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3864/ 159576 | consumed samples: 76400 | elapsed time per iteration (ms): 14585.8 | learning rate: 2.116E-05 | global batch size: 32 | lm loss: 6.579299E+00 | loss scale: 16384.0 | grad norm: 71685.183 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3865/ 159576 | consumed samples: 76432 | elapsed time per iteration (ms): 14801.5 | learning rate: 2.116E-05 | global batch size: 32 | lm loss: 6.356242E+00 | loss scale: 16384.0 | grad norm: 68636.493 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3866/ 159576 | consumed samples: 76464 | elapsed time per iteration (ms): 14581.8 | learning rate: 2.117E-05 | global batch size: 32 | lm loss: 6.583051E+00 | loss scale: 16384.0 | grad norm: 83498.983 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3867/ 159576 | consumed samples: 76496 | elapsed time per iteration (ms): 14548.1 | learning rate: 2.118E-05 | global batch size: 32 | lm loss: 6.414474E+00 | loss scale: 16384.0 
| grad norm: 70120.527 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3868/ 159576 | consumed samples: 76528 | elapsed time per iteration (ms): 14581.2 | learning rate: 2.119E-05 | global batch size: 32 | lm loss: 6.383676E+00 | loss scale: 16384.0 | grad norm: 65625.290 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3869/ 159576 | consumed samples: 76560 | elapsed time per iteration (ms): 14975.0 | learning rate: 2.120E-05 | global batch size: 32 | lm loss: 6.553302E+00 | loss scale: 16384.0 | grad norm: 78443.319 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3870/ 159576 | consumed samples: 76592 | elapsed time per iteration (ms): 14654.1 | learning rate: 2.121E-05 | global batch size: 32 | lm loss: 6.525763E+00 | loss scale: 16384.0 | grad norm: 74575.789 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3871/ 159576 | consumed samples: 76624 | elapsed time per iteration (ms): 14658.5 | learning rate: 2.122E-05 | global batch size: 32 | lm loss: 6.416959E+00 | loss scale: 16384.0 | grad norm: 61001.593 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3872/ 159576 | consumed samples: 76656 | elapsed time per iteration (ms): 14544.3 | learning rate: 2.123E-05 | global batch size: 32 | lm loss: 6.516649E+00 | loss scale: 16384.0 | grad norm: 76582.538 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3873/ 159576 | consumed samples: 76688 | elapsed time per iteration (ms): 14961.2 | learning rate: 2.124E-05 | global batch size: 32 | lm loss: 6.532383E+00 | loss scale: 16384.0 | grad norm: 98540.585 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3874/ 159576 | consumed samples: 76720 | elapsed time per iteration (ms): 14595.7 | learning rate: 2.124E-05 | global batch size: 32 | lm loss: 6.589262E+00 | loss scale: 16384.0 | grad norm: 90020.937 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3875/ 159576 | consumed samples: 76752 | elapsed time per iteration (ms): 14549.8 | learning rate: 2.125E-05 | global batch size: 32 | lm loss: 6.475612E+00 | loss scale: 16384.0 | grad norm: 71253.795 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3876/ 159576 | consumed samples: 76784 | elapsed time per iteration (ms): 14539.7 | learning rate: 2.126E-05 | global batch size: 32 | lm loss: 6.477540E+00 | loss scale: 16384.0 | grad norm: 113904.264 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3877/ 159576 | consumed samples: 76816 | elapsed time per iteration (ms): 14922.4 | learning rate: 2.127E-05 | global batch size: 32 | lm loss: 6.475825E+00 | loss scale: 16384.0 | grad norm: 59736.077 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3878/ 159576 | consumed samples: 76848 | elapsed time per iteration (ms): 14676.0 | learning rate: 2.128E-05 | global batch size: 32 | lm loss: 6.477038E+00 | loss scale: 16384.0 | grad norm: 73926.427 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3879/ 159576 | consumed samples: 
76880 | elapsed time per iteration (ms): 14505.4 | learning rate: 2.129E-05 | global batch size: 32 | lm loss: 6.577363E+00 | loss scale: 16384.0 | grad norm: 65273.771 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3880/ 159576 | consumed samples: 76912 | elapsed time per iteration (ms): 14525.2 | learning rate: 2.130E-05 | global batch size: 32 | lm loss: 6.431276E+00 | loss scale: 16384.0 | grad norm: 62353.041 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3881/ 159576 | consumed samples: 76944 | elapsed time per iteration (ms): 14918.9 | learning rate: 2.131E-05 | global batch size: 32 | lm loss: 6.471975E+00 | loss scale: 16384.0 | grad norm: 80402.399 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3882/ 159576 | consumed samples: 76976 | elapsed time per iteration (ms): 14543.5 | learning rate: 2.132E-05 | global batch size: 32 | lm loss: 6.481179E+00 | loss scale: 16384.0 | grad norm: 59241.446 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3883/ 159576 | consumed samples: 77008 | elapsed time per iteration (ms): 14519.1 | learning rate: 2.132E-05 | global batch size: 32 | lm loss: 6.356431E+00 | loss scale: 16384.0 | grad norm: 66124.949 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3884/ 159576 | consumed samples: 77040 | elapsed time per iteration (ms): 14635.6 | learning rate: 2.133E-05 | global batch size: 32 | lm loss: 7.171796E+00 | loss scale: 16384.0 | grad norm: 628102.297 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3885/ 159576 | consumed samples: 77072 | elapsed time per iteration (ms): 14877.6 | learning rate: 2.134E-05 | global batch size: 32 | lm loss: 7.122965E+00 | loss scale: 16384.0 | grad norm: 105361.079 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3886/ 159576 | consumed samples: 77104 | elapsed time per iteration (ms): 14581.7 | learning rate: 2.135E-05 | global batch size: 32 | lm loss: 6.781033E+00 | loss scale: 16384.0 | grad norm: 90805.956 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3887/ 159576 | consumed samples: 77136 | elapsed time per iteration (ms): 14580.5 | learning rate: 2.136E-05 | global batch size: 32 | lm loss: 6.824611E+00 | loss scale: 16384.0 | grad norm: 128888.283 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3888/ 159576 | consumed samples: 77168 | elapsed time per iteration (ms): 14468.4 | learning rate: 2.137E-05 | global batch size: 32 | lm loss: 6.773994E+00 | loss scale: 16384.0 | grad norm: 67441.277 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3889/ 159576 | consumed samples: 77200 | elapsed time per iteration (ms): 14934.3 | learning rate: 2.138E-05 | global batch size: 32 | lm loss: 6.845183E+00 | loss scale: 16384.0 | grad norm: 171660.767 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3890/ 159576 | consumed samples: 77232 | elapsed time per iteration (ms): 14531.8 | learning rate: 2.139E-05 | global batch size: 32 | lm loss: 6.803124E+00 | loss scale: 16384.0 | grad norm: 
100767.890 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3891/ 159576 | consumed samples: 77264 | elapsed time per iteration (ms): 14568.7 | learning rate: 2.139E-05 | global batch size: 32 | lm loss: 6.825951E+00 | loss scale: 16384.0 | grad norm: 84326.742 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3892/ 159576 | consumed samples: 77296 | elapsed time per iteration (ms): 14543.8 | learning rate: 2.140E-05 | global batch size: 32 | lm loss: 6.734772E+00 | loss scale: 16384.0 | grad norm: 87236.773 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3893/ 159576 | consumed samples: 77328 | elapsed time per iteration (ms): 14607.7 | learning rate: 2.141E-05 | global batch size: 32 | lm loss: 6.789660E+00 | loss scale: 16384.0 | grad norm: 88054.207 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3894/ 159576 | consumed samples: 77360 | elapsed time per iteration (ms): 14920.9 | learning rate: 2.142E-05 | global batch size: 32 | lm loss: 6.710454E+00 | loss scale: 16384.0 | grad norm: 182978.046 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3895/ 159576 | consumed samples: 77392 | elapsed time per iteration (ms): 14510.2 | learning rate: 2.143E-05 | global batch size: 32 | lm loss: 6.691602E+00 | loss scale: 16384.0 | grad norm: 119037.944 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3896/ 159576 | consumed samples: 77424 | elapsed time per iteration (ms): 14496.2 | learning rate: 2.144E-05 | global batch size: 32 | lm loss: 6.739342E+00 | loss scale: 16384.0 | grad norm: 97461.502 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3897/ 159576 | consumed samples: 77456 | elapsed time per iteration (ms): 14526.7 | learning rate: 2.145E-05 | global batch size: 32 | lm loss: 6.818674E+00 | loss scale: 16384.0 | grad norm: 86334.005 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3898/ 159576 | consumed samples: 77488 | elapsed time per iteration (ms): 14792.9 | learning rate: 2.146E-05 | global batch size: 32 | lm loss: 6.717194E+00 | loss scale: 16384.0 | grad norm: 113951.645 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3899/ 159576 | consumed samples: 77520 | elapsed time per iteration (ms): 14491.5 | learning rate: 2.147E-05 | global batch size: 32 | lm loss: 6.714782E+00 | loss scale: 16384.0 | grad norm: 99766.959 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3900/ 159576 | consumed samples: 77552 | elapsed time per iteration (ms): 14584.1 | learning rate: 2.147E-05 | global batch size: 32 | lm loss: 6.659179E+00 | loss scale: 16384.0 | grad norm: 89663.421 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3901/ 159576 | consumed samples: 77584 | elapsed time per iteration (ms): 14629.2 | learning rate: 2.148E-05 | global batch size: 32 | lm loss: 6.615579E+00 | loss scale: 16384.0 | grad norm: 68957.535 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3902/ 159576 | consumed samples: 77616 | 
elapsed time per iteration (ms): 14617.9 | learning rate: 2.149E-05 | global batch size: 32 | lm loss: 6.606854E+00 | loss scale: 16384.0 | grad norm: 99968.600 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3903/ 159576 | consumed samples: 77648 | elapsed time per iteration (ms): 14554.1 | learning rate: 2.150E-05 | global batch size: 32 | lm loss: 6.537298E+00 | loss scale: 16384.0 | grad norm: 67921.849 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3904/ 159576 | consumed samples: 77680 | elapsed time per iteration (ms): 14545.4 | learning rate: 2.151E-05 | global batch size: 32 | lm loss: 6.606940E+00 | loss scale: 16384.0 | grad norm: 145573.785 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3905/ 159576 | consumed samples: 77712 | elapsed time per iteration (ms): 14521.9 | learning rate: 2.152E-05 | global batch size: 32 | lm loss: 6.625298E+00 | loss scale: 16384.0 | grad norm: 96778.059 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3906/ 159576 | consumed samples: 77744 | elapsed time per iteration (ms): 14699.2 | learning rate: 2.153E-05 | global batch size: 32 | lm loss: 6.624491E+00 | loss scale: 16384.0 | grad norm: 92738.461 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3907/ 159576 | consumed samples: 77776 | elapsed time per iteration (ms): 14558.6 | learning rate: 2.154E-05 | global batch size: 32 | lm loss: 6.825802E+00 | loss scale: 16384.0 | grad norm: 119492.559 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3908/ 159576 | consumed samples: 77808 | elapsed time per iteration (ms): 14547.7 | learning rate: 2.155E-05 | global batch size: 32 | lm loss: 6.591653E+00 | loss scale: 16384.0 | grad norm: 78761.796 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3909/ 159576 | consumed samples: 77840 | elapsed time per iteration (ms): 14554.0 | learning rate: 2.155E-05 | global batch size: 32 | lm loss: 6.567001E+00 | loss scale: 16384.0 | grad norm: 147075.233 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3910/ 159576 | consumed samples: 77872 | elapsed time per iteration (ms): 15013.4 | learning rate: 2.156E-05 | global batch size: 32 | lm loss: 6.787440E+00 | loss scale: 16384.0 | grad norm: 142314.988 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3911/ 159576 | consumed samples: 77904 | elapsed time per iteration (ms): 14566.2 | learning rate: 2.157E-05 | global batch size: 32 | lm loss: 6.525432E+00 | loss scale: 16384.0 | grad norm: 87369.307 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3912/ 159576 | consumed samples: 77936 | elapsed time per iteration (ms): 14516.0 | learning rate: 2.158E-05 | global batch size: 32 | lm loss: 6.615817E+00 | loss scale: 16384.0 | grad norm: 83904.990 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3913/ 159576 | consumed samples: 77968 | elapsed time per iteration (ms): 14525.8 | learning rate: 2.159E-05 | global batch size: 32 | lm loss: 6.564670E+00 | loss scale: 16384.0 | grad norm: 97516.560 | 
num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3914/ 159576 | consumed samples: 78000 | elapsed time per iteration (ms): 15027.0 | learning rate: 2.160E-05 | global batch size: 32 | lm loss: 6.400544E+00 | loss scale: 16384.0 | grad norm: 92743.388 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3915/ 159576 | consumed samples: 78032 | elapsed time per iteration (ms): 14573.6 | learning rate: 2.161E-05 | global batch size: 32 | lm loss: 6.603245E+00 | loss scale: 16384.0 | grad norm: 106541.895 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3916/ 159576 | consumed samples: 78064 | elapsed time per iteration (ms): 14538.9 | learning rate: 2.162E-05 | global batch size: 32 | lm loss: 6.560642E+00 | loss scale: 16384.0 | grad norm: 71313.618 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3917/ 159576 | consumed samples: 78096 | elapsed time per iteration (ms): 14550.2 | learning rate: 2.163E-05 | global batch size: 32 | lm loss: 6.578140E+00 | loss scale: 16384.0 | grad norm: 83812.809 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3918/ 159576 | consumed samples: 78128 | elapsed time per iteration (ms): 14857.6 | learning rate: 2.163E-05 | global batch size: 32 | lm loss: 6.583351E+00 | loss scale: 16384.0 | grad norm: 69616.816 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3919/ 159576 | consumed samples: 78160 | elapsed time per iteration (ms): 14509.2 | learning rate: 2.164E-05 | global batch size: 32 | lm loss: 6.595952E+00 | loss scale: 16384.0 | grad norm: 83133.116 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3920/ 159576 | consumed samples: 78192 | elapsed time per iteration (ms): 14502.7 | learning rate: 2.165E-05 | global batch size: 32 | lm loss: 6.645111E+00 | loss scale: 16384.0 | grad norm: 69570.909 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3921/ 159576 | consumed samples: 78224 | elapsed time per iteration (ms): 14498.8 | learning rate: 2.166E-05 | global batch size: 32 | lm loss: 6.553501E+00 | loss scale: 16384.0 | grad norm: 142896.192 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3922/ 159576 | consumed samples: 78256 | elapsed time per iteration (ms): 14842.1 | learning rate: 2.167E-05 | global batch size: 32 | lm loss: 6.687614E+00 | loss scale: 16384.0 | grad norm: 107346.964 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3923/ 159576 | consumed samples: 78288 | elapsed time per iteration (ms): 14567.6 | learning rate: 2.168E-05 | global batch size: 32 | lm loss: 6.764112E+00 | loss scale: 16384.0 | grad norm: 75484.388 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3924/ 159576 | consumed samples: 78320 | elapsed time per iteration (ms): 14603.6 | learning rate: 2.169E-05 | global batch size: 32 | lm loss: 6.384696E+00 | loss scale: 16384.0 | grad norm: 91570.469 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3925/ 159576 | consumed samples: 78352 | elapsed time per 
iteration (ms): 14494.1 | learning rate: 2.170E-05 | global batch size: 32 | lm loss: 6.148740E+00 | loss scale: 16384.0 | grad norm: 66094.874 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3926/ 159576 | consumed samples: 78384 | elapsed time per iteration (ms): 14880.0 | learning rate: 2.171E-05 | global batch size: 32 | lm loss: 6.492467E+00 | loss scale: 16384.0 | grad norm: 95980.364 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3927/ 159576 | consumed samples: 78416 | elapsed time per iteration (ms): 14529.0 | learning rate: 2.171E-05 | global batch size: 32 | lm loss: 6.634668E+00 | loss scale: 16384.0 | grad norm: 102240.933 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3928/ 159576 | consumed samples: 78448 | elapsed time per iteration (ms): 14524.9 | learning rate: 2.172E-05 | global batch size: 32 | lm loss: 6.542571E+00 | loss scale: 16384.0 | grad norm: 78190.337 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3929/ 159576 | consumed samples: 78480 | elapsed time per iteration (ms): 14519.9 | learning rate: 2.173E-05 | global batch size: 32 | lm loss: 6.546354E+00 | loss scale: 16384.0 | grad norm: 69181.655 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3930/ 159576 | consumed samples: 78512 | elapsed time per iteration (ms): 14848.7 | learning rate: 2.174E-05 | global batch size: 32 | lm loss: 6.556016E+00 | loss scale: 16384.0 | grad norm: 166890.175 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3931/ 159576 | consumed samples: 78544 | elapsed time per iteration (ms): 14630.3 | learning rate: 2.175E-05 | global batch size: 32 | lm loss: 6.575625E+00 | loss scale: 16384.0 | grad norm: 67026.457 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3932/ 159576 | consumed samples: 78576 | elapsed time per iteration (ms): 14503.2 | learning rate: 2.176E-05 | global batch size: 32 | lm loss: 6.528583E+00 | loss scale: 16384.0 | grad norm: 65300.446 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3933/ 159576 | consumed samples: 78608 | elapsed time per iteration (ms): 14533.6 | learning rate: 2.177E-05 | global batch size: 32 | lm loss: 6.571996E+00 | loss scale: 16384.0 | grad norm: 61530.557 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3934/ 159576 | consumed samples: 78640 | elapsed time per iteration (ms): 14528.2 | learning rate: 2.178E-05 | global batch size: 32 | lm loss: 6.524823E+00 | loss scale: 16384.0 | grad norm: 58107.513 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3935/ 159576 | consumed samples: 78672 | elapsed time per iteration (ms): 14801.4 | learning rate: 2.179E-05 | global batch size: 32 | lm loss: 6.627916E+00 | loss scale: 16384.0 | grad norm: 64798.821 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3936/ 159576 | consumed samples: 78704 | elapsed time per iteration (ms): 14509.3 | learning rate: 2.179E-05 | global batch size: 32 | lm loss: 6.511620E+00 | loss scale: 16384.0 | grad norm: 59258.569 | num zeros: 0.0 | 
number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3937/ 159576 | consumed samples: 78736 | elapsed time per iteration (ms): 14529.7 | learning rate: 2.180E-05 | global batch size: 32 | lm loss: 6.414696E+00 | loss scale: 16384.0 | grad norm: 75598.973 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3938/ 159576 | consumed samples: 78768 | elapsed time per iteration (ms): 14568.6 | learning rate: 2.181E-05 | global batch size: 32 | lm loss: 6.692476E+00 | loss scale: 16384.0 | grad norm: 68594.644 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3939/ 159576 | consumed samples: 78800 | elapsed time per iteration (ms): 14680.0 | learning rate: 2.182E-05 | global batch size: 32 | lm loss: 6.509182E+00 | loss scale: 16384.0 | grad norm: 77431.860 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3940/ 159576 | consumed samples: 78832 | elapsed time per iteration (ms): 14561.3 | learning rate: 2.183E-05 | global batch size: 32 | lm loss: 6.521114E+00 | loss scale: 16384.0 | grad norm: 67107.459 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3941/ 159576 | consumed samples: 78864 | elapsed time per iteration (ms): 14540.3 | learning rate: 2.184E-05 | global batch size: 32 | lm loss: 6.557777E+00 | loss scale: 16384.0 | grad norm: 82252.980 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3942/ 159576 | consumed samples: 78896 | elapsed time per iteration (ms): 14516.4 | learning rate: 2.185E-05 | global batch size: 32 | lm loss: 6.519272E+00 | loss scale: 16384.0 | grad norm: 62956.678 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3943/ 159576 | consumed samples: 78928 | elapsed time per iteration (ms): 14804.0 | learning rate: 2.186E-05 | global batch size: 32 | lm loss: 6.436077E+00 | loss scale: 16384.0 | grad norm: 63372.650 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3944/ 159576 | consumed samples: 78960 | elapsed time per iteration (ms): 14504.5 | learning rate: 2.187E-05 | global batch size: 32 | lm loss: 6.536609E+00 | loss scale: 16384.0 | grad norm: 70623.314 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3945/ 159576 | consumed samples: 78992 | elapsed time per iteration (ms): 14519.8 | learning rate: 2.187E-05 | global batch size: 32 | lm loss: 6.631818E+00 | loss scale: 16384.0 | grad norm: 62267.463 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3946/ 159576 | consumed samples: 79024 | elapsed time per iteration (ms): 14592.1 | learning rate: 2.188E-05 | global batch size: 32 | lm loss: 6.263665E+00 | loss scale: 16384.0 | grad norm: 67107.842 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3947/ 159576 | consumed samples: 79056 | elapsed time per iteration (ms): 14791.6 | learning rate: 2.189E-05 | global batch size: 32 | lm loss: 6.622372E+00 | loss scale: 16384.0 | grad norm: 84764.799 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3948/ 159576 | consumed samples: 79088 | elapsed time per iteration (ms): 
14637.3 | learning rate: 2.190E-05 | global batch size: 32 | lm loss: 6.395759E+00 | loss scale: 16384.0 | grad norm: 60113.545 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3949/ 159576 | consumed samples: 79120 | elapsed time per iteration (ms): 14546.6 | learning rate: 2.191E-05 | global batch size: 32 | lm loss: 6.588756E+00 | loss scale: 16384.0 | grad norm: 68679.133 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3950/ 159576 | consumed samples: 79152 | elapsed time per iteration (ms): 14514.6 | learning rate: 2.192E-05 | global batch size: 32 | lm loss: 6.484011E+00 | loss scale: 16384.0 | grad norm: 68729.821 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3951/ 159576 | consumed samples: 79184 | elapsed time per iteration (ms): 14907.8 | learning rate: 2.193E-05 | global batch size: 32 | lm loss: 6.496289E+00 | loss scale: 16384.0 | grad norm: 58918.789 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3952/ 159576 | consumed samples: 79216 | elapsed time per iteration (ms): 14467.7 | learning rate: 2.194E-05 | global batch size: 32 | lm loss: 6.442475E+00 | loss scale: 16384.0 | grad norm: 73240.452 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3953/ 159576 | consumed samples: 79248 | elapsed time per iteration (ms): 14613.3 | learning rate: 2.195E-05 | global batch size: 32 | lm loss: 6.412640E+00 | loss scale: 16384.0 | grad norm: 63495.861 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3954/ 159576 | consumed samples: 79280 | elapsed time per iteration (ms): 14497.1 | learning rate: 2.195E-05 | global batch size: 32 | lm loss: 6.419092E+00 | loss scale: 16384.0 | grad norm: 64832.581 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3955/ 159576 | consumed samples: 79312 | elapsed time per iteration (ms): 14864.8 | learning rate: 2.196E-05 | global batch size: 32 | lm loss: 6.411493E+00 | loss scale: 16384.0 | grad norm: 70227.738 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3956/ 159576 | consumed samples: 79344 | elapsed time per iteration (ms): 14501.1 | learning rate: 2.197E-05 | global batch size: 32 | lm loss: 6.377773E+00 | loss scale: 16384.0 | grad norm: 65521.131 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3957/ 159576 | consumed samples: 79376 | elapsed time per iteration (ms): 14522.7 | learning rate: 2.198E-05 | global batch size: 32 | lm loss: 6.458980E+00 | loss scale: 16384.0 | grad norm: 62294.197 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3958/ 159576 | consumed samples: 79408 | elapsed time per iteration (ms): 14509.2 | learning rate: 2.199E-05 | global batch size: 32 | lm loss: 6.540348E+00 | loss scale: 16384.0 | grad norm: 64994.102 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3959/ 159576 | consumed samples: 79440 | elapsed time per iteration (ms): 14868.7 | learning rate: 2.200E-05 | global batch size: 32 | lm loss: 6.503858E+00 | loss scale: 16384.0 | grad norm: 54271.909 | num zeros: 0.0 | number of skipped 
iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3960/ 159576 | consumed samples: 79472 | elapsed time per iteration (ms): 14512.5 | learning rate: 2.201E-05 | global batch size: 32 | lm loss: 6.372645E+00 | loss scale: 16384.0 | grad norm: 73237.307 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3961/ 159576 | consumed samples: 79504 | elapsed time per iteration (ms): 14552.3 | learning rate: 2.202E-05 | global batch size: 32 | lm loss: 6.396554E+00 | loss scale: 16384.0 | grad norm: 64579.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3962/ 159576 | consumed samples: 79536 | elapsed time per iteration (ms): 14559.3 | learning rate: 2.203E-05 | global batch size: 32 | lm loss: 6.556979E+00 | loss scale: 16384.0 | grad norm: 83489.476 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3963/ 159576 | consumed samples: 79568 | elapsed time per iteration (ms): 14899.9 | learning rate: 2.203E-05 | global batch size: 32 | lm loss: 6.458327E+00 | loss scale: 16384.0 | grad norm: 58716.823 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3964/ 159576 | consumed samples: 79600 | elapsed time per iteration (ms): 14539.5 | learning rate: 2.204E-05 | global batch size: 32 | lm loss: 6.802517E+00 | loss scale: 16384.0 | grad norm: 60731.153 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3965/ 159576 | consumed samples: 79632 | elapsed time per iteration (ms): 14520.1 | learning rate: 2.205E-05 | global batch size: 32 | lm loss: 6.616902E+00 | loss scale: 16384.0 | grad norm: 64155.719 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3966/ 159576 | consumed samples: 79664 | elapsed time per iteration (ms): 14585.2 | learning rate: 2.206E-05 | global batch size: 32 | lm loss: 6.457995E+00 | loss scale: 16384.0 | grad norm: 74880.971 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3967/ 159576 | consumed samples: 79696 | elapsed time per iteration (ms): 14850.0 | learning rate: 2.207E-05 | global batch size: 32 | lm loss: 6.591904E+00 | loss scale: 16384.0 | grad norm: 75336.614 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3968/ 159576 | consumed samples: 79728 | elapsed time per iteration (ms): 14661.7 | learning rate: 2.208E-05 | global batch size: 32 | lm loss: 6.475752E+00 | loss scale: 16384.0 | grad norm: 76852.677 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3969/ 159576 | consumed samples: 79760 | elapsed time per iteration (ms): 14523.7 | learning rate: 2.209E-05 | global batch size: 32 | lm loss: 6.452621E+00 | loss scale: 16384.0 | grad norm: 65844.475 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3970/ 159576 | consumed samples: 79792 | elapsed time per iteration (ms): 14549.1 | learning rate: 2.210E-05 | global batch size: 32 | lm loss: 6.401618E+00 | loss scale: 16384.0 | grad norm: 84954.581 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 3971/ 159576 | consumed samples: 79824 | elapsed time per iteration (ms): 14508.8 | learning 
- iteration 3972/ 159576 | consumed samples: 79856 | elapsed time per iteration (ms): 14847.5 | learning rate: 2.211E-05 | global batch size: 32 | lm loss: 6.601567E+00 | loss scale: 16384.0 | grad norm: 74563.765 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3973/ 159576 | consumed samples: 79888 | elapsed time per iteration (ms): 14594.0 | learning rate: 2.212E-05 | global batch size: 32 | lm loss: 6.441951E+00 | loss scale: 16384.0 | grad norm: 72653.525 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3974/ 159576 | consumed samples: 79920 | elapsed time per iteration (ms): 14478.4 | learning rate: 2.213E-05 | global batch size: 32 | lm loss: 6.510294E+00 | loss scale: 16384.0 | grad norm: 65083.374 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3975/ 159576 | consumed samples: 79952 | elapsed time per iteration (ms): 14520.1 | learning rate: 2.214E-05 | global batch size: 32 | lm loss: 6.345959E+00 | loss scale: 16384.0 | grad norm: 133600.019 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3976/ 159576 | consumed samples: 79984 | elapsed time per iteration (ms): 14770.3 | learning rate: 2.215E-05 | global batch size: 32 | lm loss: 6.477483E+00 | loss scale: 16384.0 | grad norm: 89443.795 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3977/ 159576 | consumed samples: 80016 | elapsed time per iteration (ms): 14483.7 | learning rate: 2.216E-05 | global batch size: 32 | lm loss: 6.466526E+00 | loss scale: 16384.0 | grad norm: 79203.283 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3978/ 159576 | consumed samples: 80048 | elapsed time per iteration (ms): 14548.9 | learning rate: 2.217E-05 | global batch size: 32 | lm loss: 6.490917E+00 | loss scale: 16384.0 | grad norm: 85035.254 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3979/ 159576 | consumed samples: 80080 | elapsed time per iteration (ms): 14519.8 | learning rate: 2.218E-05 | global batch size: 32 | lm loss: 6.412145E+00 | loss scale: 16384.0 | grad norm: 93580.388 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3980/ 159576 | consumed samples: 80112 | elapsed time per iteration (ms): 14659.7 | learning rate: 2.218E-05 | global batch size: 32 | lm loss: 6.473646E+00 | loss scale: 16384.0 | grad norm: 79422.522 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3981/ 159576 | consumed samples: 80144 | elapsed time per iteration (ms): 14525.1 | learning rate: 2.219E-05 | global batch size: 32 | lm loss: 6.522334E+00 | loss scale: 16384.0 | grad norm: 83533.865 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3982/ 159576 | consumed samples: 80176 | elapsed time per iteration (ms): 14543.1 | learning rate: 2.220E-05 | global batch size: 32 | lm loss: 6.387228E+00 | loss scale: 16384.0 | grad norm: 89795.957 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3983/ 159576 | consumed samples: 80208 | elapsed time per iteration (ms): 14609.8 | learning rate: 2.221E-05 | global batch size: 32 | lm loss: 6.475267E+00 | loss scale: 16384.0 | grad norm: 119598.589 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3984/ 159576 | consumed samples: 80240 | elapsed time per iteration (ms): 14596.2 | learning rate: 2.222E-05 | global batch size: 32 | lm loss: 6.533351E+00 | loss scale: 16384.0 | grad norm: 72306.036 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3985/ 159576 | consumed samples: 80272 | elapsed time per iteration (ms): 14621.5 | learning rate: 2.223E-05 | global batch size: 32 | lm loss: 6.540237E+00 | loss scale: 16384.0 | grad norm: 88358.505 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3986/ 159576 | consumed samples: 80304 | elapsed time per iteration (ms): 14563.8 | learning rate: 2.224E-05 | global batch size: 32 | lm loss: 6.419699E+00 | loss scale: 16384.0 | grad norm: 75411.849 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3987/ 159576 | consumed samples: 80336 | elapsed time per iteration (ms): 14555.9 | learning rate: 2.225E-05 | global batch size: 32 | lm loss: 6.591748E+00 | loss scale: 16384.0 | grad norm: 112139.715 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3988/ 159576 | consumed samples: 80368 | elapsed time per iteration (ms): 15004.4 | learning rate: 2.226E-05 | global batch size: 32 | lm loss: 6.551664E+00 | loss scale: 16384.0 | grad norm: 88397.931 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3989/ 159576 | consumed samples: 80400 | elapsed time per iteration (ms): 14610.9 | learning rate: 2.226E-05 | global batch size: 32 | lm loss: 6.531049E+00 | loss scale: 16384.0 | grad norm: 63924.116 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3990/ 159576 | consumed samples: 80432 | elapsed time per iteration (ms): 14532.5 | learning rate: 2.227E-05 | global batch size: 32 | lm loss: 6.546918E+00 | loss scale: 16384.0 | grad norm: 97299.376 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3991/ 159576 | consumed samples: 80464 | elapsed time per iteration (ms): 14437.4 | learning rate: 2.228E-05 | global batch size: 32 | lm loss: 6.471569E+00 | loss scale: 16384.0 | grad norm: 76326.402 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3992/ 159576 | consumed samples: 80496 | elapsed time per iteration (ms): 14906.8 | learning rate: 2.229E-05 | global batch size: 32 | lm loss: 6.525407E+00 | loss scale: 16384.0 | grad norm: 77183.511 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3993/ 159576 | consumed samples: 80528 | elapsed time per iteration (ms): 14534.2 | learning rate: 2.230E-05 | global batch size: 32 | lm loss: 6.539597E+00 | loss scale: 16384.0 | grad norm: 60376.571 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3994/ 159576 | consumed samples: 80560 | elapsed time per iteration (ms): 14579.3 | learning rate: 2.231E-05 | global batch size: 32 | lm loss: 6.552666E+00 | loss scale: 16384.0 | grad norm: 84746.386 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3995/ 159576 | consumed samples: 80592 | elapsed time per iteration (ms): 14529.3 | learning rate: 2.232E-05 | global batch size: 32 | lm loss: 6.413946E+00 | loss scale: 16384.0 | grad norm: 67969.641 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3996/ 159576 | consumed samples: 80624 | elapsed time per iteration (ms): 14922.8 | learning rate: 2.233E-05 | global batch size: 32 | lm loss: 6.427969E+00 | loss scale: 16384.0 | grad norm: 69649.626 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3997/ 159576 | consumed samples: 80656 | elapsed time per iteration (ms): 14490.2 | learning rate: 2.234E-05 | global batch size: 32 | lm loss: 6.527985E+00 | loss scale: 16384.0 | grad norm: 71644.429 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3998/ 159576 | consumed samples: 80688 | elapsed time per iteration (ms): 14523.9 | learning rate: 2.234E-05 | global batch size: 32 | lm loss: 6.565372E+00 | loss scale: 16384.0 | grad norm: 75823.635 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 3999/ 159576 | consumed samples: 80720 | elapsed time per iteration (ms): 14567.0 | learning rate: 2.235E-05 | global batch size: 32 | lm loss: 6.377391E+00 | loss scale: 16384.0 | grad norm: 85967.590 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-[2021-09-24 17:49:17,238] [INFO] [logging.py:68:log_dist] [Rank 0] step=4000, skipped=6, lr=[2.2362426035502958e-05, 2.2362426035502958e-05], mom=[(0.9, 0.999), (0.9, 0.999)]
-steps: 4000 loss: 6.3572 iter time (s): 0.007 samples/sec: 4556.985
- iteration 4000/ 159576 | consumed samples: 80752 | elapsed time per iteration (ms): 14858.7 | learning rate: 2.236E-05 | global batch size: 32 | lm loss: 6.357164E+00 | loss scale: 16384.0 | grad norm: 56878.207 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-------------------------------------------------------------------------------------------------
- validation loss at iteration 4000 | lm loss value: 6.453290E+00 | lm loss PPL: 6.347876E+02 |
-------------------------------------------------------------------------------------------------
- iteration 4001/ 159576 | consumed samples: 80784 | elapsed time per iteration (ms): 20796.3 | learning rate: 2.237E-05 | global batch size: 32 | lm loss: 6.357805E+00 | loss scale: 16384.0 | grad norm: 75271.208 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4002/ 159576 | consumed samples: 80816 | elapsed time per iteration (ms): 14528.3 | learning rate: 2.238E-05 | global batch size: 32 | lm loss: 6.590372E+00 | loss scale: 16384.0 | grad norm: 82823.216 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4003/ 159576 | consumed samples: 80848 | elapsed time per iteration (ms): 14569.0 | learning rate: 2.239E-05 | global batch size: 32 | lm loss: 6.547601E+00 | loss scale: 16384.0 | grad norm: 63495.848 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4004/ 159576 | consumed samples: 80880 | elapsed time per iteration (ms): 14981.7 | learning rate: 2.240E-05 | global batch size: 32 | lm loss: 6.488581E+00 | loss scale: 16384.0 | grad norm: 84538.823 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
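Two consistency checks on the iteration-4000 block above: the reported validation PPL is exp(lm loss), and the samples/sec in the DeepSpeed "steps: 4000" line agrees with the global batch size divided by its per-step "iter time" (a different timer from the ~14.5 s wall-clock "elapsed time per iteration" field). A quick sketch with the logged values, not project code:

    import math

    # Validation at iteration 4000: lm loss 6.453290E+00, PPL 6.347876E+02.
    assert abs(math.exp(6.453290) - 634.7876) < 0.05  # PPL = exp(cross-entropy)

    # 4556.985 samples/sec at global batch size 32 implies ~0.00702 s per step,
    # which the log prints rounded as "iter time (s): 0.007".
    print(32 / 4556.985)  # -> 0.00702...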
- iteration 4005/ 159576 | consumed samples: 80912 | elapsed time per iteration (ms): 14517.6 | learning rate: 2.241E-05 | global batch size: 32 | lm loss: 6.473035E+00 | loss scale: 16384.0 | grad norm: 69154.929 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4006/ 159576 | consumed samples: 80944 | elapsed time per iteration (ms): 14515.3 | learning rate: 2.242E-05 | global batch size: 32 | lm loss: 6.574604E+00 | loss scale: 16384.0 | grad norm: 71258.786 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4007/ 159576 | consumed samples: 80976 | elapsed time per iteration (ms): 14530.3 | learning rate: 2.242E-05 | global batch size: 32 | lm loss: 6.480978E+00 | loss scale: 16384.0 | grad norm: 63598.555 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4008/ 159576 | consumed samples: 81008 | elapsed time per iteration (ms): 15052.4 | learning rate: 2.243E-05 | global batch size: 32 | lm loss: 6.393389E+00 | loss scale: 16384.0 | grad norm: 76474.916 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4009/ 159576 | consumed samples: 81040 | elapsed time per iteration (ms): 14618.9 | learning rate: 2.244E-05 | global batch size: 32 | lm loss: 6.322450E+00 | loss scale: 16384.0 | grad norm: 62736.146 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4010/ 159576 | consumed samples: 81072 | elapsed time per iteration (ms): 14521.7 | learning rate: 2.245E-05 | global batch size: 32 | lm loss: 6.502364E+00 | loss scale: 16384.0 | grad norm: 78751.861 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4011/ 159576 | consumed samples: 81104 | elapsed time per iteration (ms): 14513.4 | learning rate: 2.246E-05 | global batch size: 32 | lm loss: 6.504915E+00 | loss scale: 16384.0 | grad norm: 73290.420 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4012/ 159576 | consumed samples: 81136 | elapsed time per iteration (ms): 14859.5 | learning rate: 2.247E-05 | global batch size: 32 | lm loss: 6.422670E+00 | loss scale: 16384.0 | grad norm: 70911.343 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4013/ 159576 | consumed samples: 81168 | elapsed time per iteration (ms): 14562.7 | learning rate: 2.248E-05 | global batch size: 32 | lm loss: 6.460926E+00 | loss scale: 16384.0 | grad norm: 88361.679 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4014/ 159576 | consumed samples: 81200 | elapsed time per iteration (ms): 14537.6 | learning rate: 2.249E-05 | global batch size: 32 | lm loss: 6.359708E+00 | loss scale: 16384.0 | grad norm: 70950.803 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4015/ 159576 | consumed samples: 81232 | elapsed time per iteration (ms): 14575.5 | learning rate: 2.250E-05 | global batch size: 32 | lm loss: 6.479752E+00 | loss scale: 16384.0 | grad norm: 60916.908 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4016/ 159576 | consumed samples: 81264 | elapsed time per iteration (ms): 14890.4 | learning rate: 2.250E-05 | global batch size: 32 | lm loss: 6.438080E+00 | loss scale: 16384.0 | grad norm: 78503.860 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4017/ 159576 | consumed samples: 81296 | elapsed time per iteration (ms): 14519.4 | learning rate: 2.251E-05 | global batch size: 32 | lm loss: 6.446492E+00 | loss scale: 16384.0 | grad norm: 66299.320 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4018/ 159576 | consumed samples: 81328 | elapsed time per iteration (ms): 14512.9 | learning rate: 2.252E-05 | global batch size: 32 | lm loss: 6.418320E+00 | loss scale: 16384.0 | grad norm: 65936.043 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4019/ 159576 | consumed samples: 81360 | elapsed time per iteration (ms): 14568.1 | learning rate: 2.253E-05 | global batch size: 32 | lm loss: 6.337445E+00 | loss scale: 16384.0 | grad norm: 71727.512 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4020/ 159576 | consumed samples: 81392 | elapsed time per iteration (ms): 14867.3 | learning rate: 2.254E-05 | global batch size: 32 | lm loss: 6.564549E+00 | loss scale: 16384.0 | grad norm: 96122.107 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4021/ 159576 | consumed samples: 81424 | elapsed time per iteration (ms): 14435.4 | learning rate: 2.255E-05 | global batch size: 32 | lm loss: 6.485852E+00 | loss scale: 16384.0 | grad norm: 82597.736 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4022/ 159576 | consumed samples: 81456 | elapsed time per iteration (ms): 14558.0 | learning rate: 2.256E-05 | global batch size: 32 | lm loss: 6.539099E+00 | loss scale: 16384.0 | grad norm: 121006.289 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4023/ 159576 | consumed samples: 81488 | elapsed time per iteration (ms): 14530.8 | learning rate: 2.257E-05 | global batch size: 32 | lm loss: 6.588836E+00 | loss scale: 16384.0 | grad norm: 83990.530 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4024/ 159576 | consumed samples: 81520 | elapsed time per iteration (ms): 14903.1 | learning rate: 2.258E-05 | global batch size: 32 | lm loss: 6.478038E+00 | loss scale: 16384.0 | grad norm: 86310.728 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4025/ 159576 | consumed samples: 81552 | elapsed time per iteration (ms): 14640.8 | learning rate: 2.258E-05 | global batch size: 32 | lm loss: 6.423618E+00 | loss scale: 16384.0 | grad norm: 72646.553 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4026/ 159576 | consumed samples: 81584 | elapsed time per iteration (ms): 14523.1 | learning rate: 2.259E-05 | global batch size: 32 | lm loss: 6.389876E+00 | loss scale: 16384.0 | grad norm: 75260.682 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4027/ 159576 | consumed samples: 81616 | elapsed time per iteration (ms): 14495.3 | learning rate: 2.260E-05 | global batch size: 32 | lm loss: 6.686980E+00 | loss scale: 16384.0 | grad norm: 68901.893 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4028/ 159576 | consumed samples: 81648 | elapsed time per iteration (ms): 14518.7 | learning rate: 2.261E-05 | global batch size: 32 | lm loss: 6.454273E+00 | loss scale: 16384.0 | grad norm: 78058.506 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4029/ 159576 | consumed samples: 81680 | elapsed time per iteration (ms): 14751.7 | learning rate: 2.262E-05 | global batch size: 32 | lm loss: 6.645922E+00 | loss scale: 16384.0 | grad norm: 90877.563 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4030/ 159576 | consumed samples: 81712 | elapsed time per iteration (ms): 14605.8 | learning rate: 2.263E-05 | global batch size: 32 | lm loss: 6.554152E+00 | loss scale: 16384.0 | grad norm: 71333.048 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4031/ 159576 | consumed samples: 81744 | elapsed time per iteration (ms): 14567.0 | learning rate: 2.264E-05 | global batch size: 32 | lm loss: 6.512757E+00 | loss scale: 16384.0 | grad norm: 75409.197 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4032/ 159576 | consumed samples: 81776 | elapsed time per iteration (ms): 14627.7 | learning rate: 2.265E-05 | global batch size: 32 | lm loss: 6.529600E+00 | loss scale: 16384.0 | grad norm: 83852.632 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4033/ 159576 | consumed samples: 81808 | elapsed time per iteration (ms): 14706.7 | learning rate: 2.266E-05 | global batch size: 32 | lm loss: 6.312231E+00 | loss scale: 16384.0 | grad norm: 64610.818 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4034/ 159576 | consumed samples: 81840 | elapsed time per iteration (ms): 14453.1 | learning rate: 2.266E-05 | global batch size: 32 | lm loss: 6.378237E+00 | loss scale: 16384.0 | grad norm: 70363.183 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4035/ 159576 | consumed samples: 81872 | elapsed time per iteration (ms): 14558.4 | learning rate: 2.267E-05 | global batch size: 32 | lm loss: 6.617406E+00 | loss scale: 16384.0 | grad norm: 76776.869 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4036/ 159576 | consumed samples: 81904 | elapsed time per iteration (ms): 14451.4 | learning rate: 2.268E-05 | global batch size: 32 | lm loss: 6.510260E+00 | loss scale: 16384.0 | grad norm: 65763.594 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4037/ 159576 | consumed samples: 81936 | elapsed time per iteration (ms): 14734.4 | learning rate: 2.269E-05 | global batch size: 32 | lm loss: 6.484540E+00 | loss scale: 16384.0 | grad norm: 113964.842 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4038/ 159576 | consumed samples: 81968 | elapsed time per iteration (ms): 14560.9 | learning rate: 2.270E-05 | global batch size: 32 | lm loss: 6.422564E+00 | loss scale: 16384.0 | grad norm: 71196.418 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4039/ 159576 | consumed samples: 82000 | elapsed time per iteration (ms): 14521.4 | learning rate: 2.271E-05 | global batch size: 32 | lm loss: 6.468810E+00 | loss scale: 16384.0 | grad norm: 81464.635 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4040/ 159576 | consumed samples: 82032 | elapsed time per iteration (ms): 14534.9 | learning rate: 2.272E-05 | global batch size: 32 | lm loss: 6.528829E+00 | loss scale: 16384.0 | grad norm: 64883.399 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4041/ 159576 | consumed samples: 82064 | elapsed time per iteration (ms): 14840.7 | learning rate: 2.273E-05 | global batch size: 32 | lm loss: 6.466451E+00 | loss scale: 16384.0 | grad norm: 113319.594 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4042/ 159576 | consumed samples: 82096 | elapsed time per iteration (ms): 14627.3 | learning rate: 2.274E-05 | global batch size: 32 | lm loss: 6.455089E+00 | loss scale: 16384.0 | grad norm: 63704.855 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4043/ 159576 | consumed samples: 82128 | elapsed time per iteration (ms): 14401.0 | learning rate: 2.274E-05 | global batch size: 32 | lm loss: 6.394213E+00 | loss scale: 16384.0 | grad norm: 104510.525 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4044/ 159576 | consumed samples: 82160 | elapsed time per iteration (ms): 14522.2 | learning rate: 2.275E-05 | global batch size: 32 | lm loss: 6.436733E+00 | loss scale: 16384.0 | grad norm: 69916.210 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4045/ 159576 | consumed samples: 82192 | elapsed time per iteration (ms): 14878.3 | learning rate: 2.276E-05 | global batch size: 32 | lm loss: 6.467334E+00 | loss scale: 16384.0 | grad norm: 86814.439 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4046/ 159576 | consumed samples: 82224 | elapsed time per iteration (ms): 14619.5 | learning rate: 2.277E-05 | global batch size: 32 | lm loss: 6.542828E+00 | loss scale: 16384.0 | grad norm: 91169.836 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4047/ 159576 | consumed samples: 82256 | elapsed time per iteration (ms): 14546.0 | learning rate: 2.278E-05 | global batch size: 32 | lm loss: 6.482902E+00 | loss scale: 16384.0 | grad norm: 71855.514 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4048/ 159576 | consumed samples: 82288 | elapsed time per iteration (ms): 14535.3 | learning rate: 2.279E-05 | global batch size: 32 | lm loss: 6.380974E+00 | loss scale: 16384.0 | grad norm: 110448.433 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4049/ 159576 | consumed samples: 82320 | elapsed time per iteration (ms): 14946.7 | learning rate: 2.280E-05 | global batch size: 32 | lm loss: 6.604033E+00 | loss scale: 16384.0 | grad norm: 86973.778 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4050/ 159576 | consumed samples: 82352 | elapsed time per iteration (ms): 14452.3 | learning rate: 2.281E-05 | global batch size: 32 | lm loss: 6.485418E+00 | loss scale: 16384.0 | grad norm: 93547.929 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4051/ 159576 | consumed samples: 82384 | elapsed time per iteration (ms): 14486.7 | learning rate: 2.282E-05 | global batch size: 32 | lm loss: 6.447795E+00 | loss scale: 16384.0 | grad norm: 71623.174 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4052/ 159576 | consumed samples: 82416 | elapsed time per iteration (ms): 14546.0 | learning rate: 2.282E-05 | global batch size: 32 | lm loss: 6.490433E+00 | loss scale: 16384.0 | grad norm: 122748.723 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4053/ 159576 | consumed samples: 82448 | elapsed time per iteration (ms): 14923.8 | learning rate: 2.283E-05 | global batch size: 32 | lm loss: 6.393107E+00 | loss scale: 16384.0 | grad norm: 94716.038 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4054/ 159576 | consumed samples: 82480 | elapsed time per iteration (ms): 14522.3 | learning rate: 2.284E-05 | global batch size: 32 | lm loss: 6.560749E+00 | loss scale: 16384.0 | grad norm: 87911.375 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4055/ 159576 | consumed samples: 82512 | elapsed time per iteration (ms): 14576.1 | learning rate: 2.285E-05 | global batch size: 32 | lm loss: 6.508199E+00 | loss scale: 16384.0 | grad norm: 75712.942 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4056/ 159576 | consumed samples: 82544 | elapsed time per iteration (ms): 14509.2 | learning rate: 2.286E-05 | global batch size: 32 | lm loss: 6.480619E+00 | loss scale: 16384.0 | grad norm: 92968.738 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4057/ 159576 | consumed samples: 82576 | elapsed time per iteration (ms): 14814.4 | learning rate: 2.287E-05 | global batch size: 32 | lm loss: 6.324226E+00 | loss scale: 16384.0 | grad norm: 78472.900 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4058/ 159576 | consumed samples: 82608 | elapsed time per iteration (ms): 14459.3 | learning rate: 2.288E-05 | global batch size: 32 | lm loss: 6.626959E+00 | loss scale: 16384.0 | grad norm: 80531.732 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4059/ 159576 | consumed samples: 82640 | elapsed time per iteration (ms): 14496.4 | learning rate: 2.289E-05 | global batch size: 32 | lm loss: 6.406682E+00 | loss scale: 16384.0 | grad norm: 75308.856 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4060/ 159576 | consumed samples: 82672 | elapsed time per iteration (ms): 14562.2 | learning rate: 2.289E-05 | global batch size: 32 | lm loss: 6.440542E+00 | loss scale: 16384.0 | grad norm: 78114.884 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4061/ 159576 | consumed samples: 82704 | elapsed time per iteration (ms): 14796.0 | learning rate: 2.290E-05 | global batch size: 32 | lm loss: 6.468933E+00 | loss scale: 16384.0 | grad norm: 77154.286 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4062/ 159576 | consumed samples: 82736 | elapsed time per iteration (ms): 14696.5 | learning rate: 2.291E-05 | global batch size: 32 | lm loss: 6.318196E+00 | loss scale: 16384.0 | grad norm: 97551.121 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4063/ 159576 | consumed samples: 82768 | elapsed time per iteration (ms): 14468.1 | learning rate: 2.292E-05 | global batch size: 32 | lm loss: 6.472930E+00 | loss scale: 16384.0 | grad norm: 110041.778 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4064/ 159576 | consumed samples: 82800 | elapsed time per iteration (ms): 14496.2 | learning rate: 2.293E-05 | global batch size: 32 | lm loss: 6.523721E+00 | loss scale: 16384.0 | grad norm: 88018.768 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4065/ 159576 | consumed samples: 82832 | elapsed time per iteration (ms): 14563.8 | learning rate: 2.294E-05 | global batch size: 32 | lm loss: 6.453180E+00 | loss scale: 16384.0 | grad norm: 83087.922 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4066/ 159576 | consumed samples: 82864 | elapsed time per iteration (ms): 14884.4 | learning rate: 2.295E-05 | global batch size: 32 | lm loss: 6.447326E+00 | loss scale: 16384.0 | grad norm: 72433.599 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4067/ 159576 | consumed samples: 82896 | elapsed time per iteration (ms): 14491.5 | learning rate: 2.296E-05 | global batch size: 32 | lm loss: 6.366633E+00 | loss scale: 16384.0 | grad norm: 100504.434 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4068/ 159576 | consumed samples: 82928 | elapsed time per iteration (ms): 14561.6 | learning rate: 2.297E-05 | global batch size: 32 | lm loss: 6.315294E+00 | loss scale: 16384.0 | grad norm: 79868.630 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4069/ 159576 | consumed samples: 82960 | elapsed time per iteration (ms): 14538.6 | learning rate: 2.297E-05 | global batch size: 32 | lm loss: 6.452709E+00 | loss scale: 16384.0 | grad norm: 94073.963 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4070/ 159576 | consumed samples: 82992 | elapsed time per iteration (ms): 14651.1 | learning rate: 2.298E-05 | global batch size: 32 | lm loss: 6.421084E+00 | loss scale: 16384.0 | grad norm: 96558.906 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4071/ 159576 | consumed samples: 83024 | elapsed time per iteration (ms): 14508.0 | learning rate: 2.299E-05 | global batch size: 32 | lm loss: 6.474918E+00 | loss scale: 16384.0 | grad norm: 104437.791 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4072/ 159576 | consumed samples: 83056 | elapsed time per iteration (ms): 14540.3 | learning rate: 2.300E-05 | global batch size: 32 | lm loss: 6.442264E+00 | loss scale: 16384.0 | grad norm: 69985.883 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-[2021-09-24 18:07:07] PULSE: tr8-104B is waiting for the previous job to finish before scheduling a new one using the dependency mechanism (1165978_[1-10%1] on 'gpu_p13' partition)
-[2021-09-24 18:07:07] PULSE: tr8-104B is running for 12:14:56 since 2021-09-24T05:52:11 (1162855_1 on 'gpu_p13' partition (r6i4n[5,7],r6i5n[2,7-8],r6i6n[0,2,6],r7i2n[4-5],r7i6n[2-4],r7i7n[7-8],r8i0n[2-3,5-8],r8i1n[0,2-4],r8i2n8,r8i3n[0-2],r8i5n[3-4],r8i7n[3-8],r9i0n[0-2],r9i1n[0-3],r9i2n[3-5,8],r9i3n[0-1,7-8],r9i4n[0-2],r9i5n[3-8],r9i6n[0,7-8])
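The two PULSE lines above are the run's SLURM status heartbeat: job array 1165978_[1-10%1] (ten elements, throttled to one concurrent run by the %1 suffix) queues behind the live job 1162855_1 via the dependency mechanism. The reported runtime is internally consistent, as a quick check with the logged timestamps shows (a sketch, not project code):

    from datetime import datetime, timedelta

    start = datetime.fromisoformat("2021-09-24T05:52:11")  # "running ... since"
    seen = datetime.fromisoformat("2021-09-24T18:07:07")   # PULSE timestamp
    assert seen - start == timedelta(hours=12, minutes=14, seconds=56)  # "12:14:56"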
- iteration 4073/ 159576 | consumed samples: 83088 | elapsed time per iteration (ms): 14430.9 | learning rate: 2.301E-05 | global batch size: 32 | lm loss: 6.464416E+00 | loss scale: 16384.0 | grad norm: 92935.764 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4074/ 159576 | consumed samples: 83120 | elapsed time per iteration (ms): 14595.5 | learning rate: 2.302E-05 | global batch size: 32 | lm loss: 6.394172E+00 | loss scale: 16384.0 | grad norm: 93727.497 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4075/ 159576 | consumed samples: 83152 | elapsed time per iteration (ms): 14478.6 | learning rate: 2.303E-05 | global batch size: 32 | lm loss: 6.535138E+00 | loss scale: 16384.0 | grad norm: 110910.133 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4076/ 159576 | consumed samples: 83184 | elapsed time per iteration (ms): 14559.7 | learning rate: 2.304E-05 | global batch size: 32 | lm loss: 6.459756E+00 | loss scale: 16384.0 | grad norm: 79798.141 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4077/ 159576 | consumed samples: 83216 | elapsed time per iteration (ms): 14529.0 | learning rate: 2.305E-05 | global batch size: 32 | lm loss: 6.388766E+00 | loss scale: 16384.0 | grad norm: 80153.289 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4078/ 159576 | consumed samples: 83248 | elapsed time per iteration (ms): 15028.3 | learning rate: 2.305E-05 | global batch size: 32 | lm loss: 6.462305E+00 | loss scale: 16384.0 | grad norm: 72541.670 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4079/ 159576 | consumed samples: 83280 | elapsed time per iteration (ms): 14501.7 | learning rate: 2.306E-05 | global batch size: 32 | lm loss: 6.606649E+00 | loss scale: 16384.0 | grad norm: 72682.132 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4080/ 159576 | consumed samples: 83312 | elapsed time per iteration (ms): 14478.7 | learning rate: 2.307E-05 | global batch size: 32 | lm loss: 6.339183E+00 | loss scale: 16384.0 | grad norm: 77952.104 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4081/ 159576 | consumed samples: 83344 | elapsed time per iteration (ms): 14534.3 | learning rate: 2.308E-05 | global batch size: 32 | lm loss: 6.482682E+00 | loss scale: 16384.0 | grad norm: 78541.532 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4082/ 159576 | consumed samples: 83376 | elapsed time per iteration (ms): 14971.6 | learning rate: 2.309E-05 | global batch size: 32 | lm loss: 6.464870E+00 | loss scale: 16384.0 | grad norm: 82812.736 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4083/ 159576 | consumed samples: 83408 | elapsed time per iteration (ms): 14619.1 | learning rate: 2.310E-05 | global batch size: 32 | lm loss: 6.468065E+00 | loss scale: 16384.0 | grad norm: 95549.999 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4084/ 159576 | consumed samples: 83440 | elapsed time per iteration (ms): 14580.8 | learning rate: 2.311E-05 | global batch size: 32 | lm loss: 6.390970E+00 | loss scale: 16384.0 | grad norm: 76775.584 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4085/ 159576 | consumed samples: 83472 | elapsed time per iteration (ms): 14597.4 | learning rate: 2.312E-05 | global batch size: 32 | lm loss: 6.441597E+00 | loss scale: 16384.0 | grad norm: 87885.418 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4086/ 159576 | consumed samples: 83504 | elapsed time per iteration (ms): 14827.9 | learning rate: 2.313E-05 | global batch size: 32 | lm loss: 6.332308E+00 | loss scale: 16384.0 | grad norm: 67530.324 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4087/ 159576 | consumed samples: 83536 | elapsed time per iteration (ms): 14496.3 | learning rate: 2.313E-05 | global batch size: 32 | lm loss: 6.360069E+00 | loss scale: 16384.0 | grad norm: 65277.636 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4088/ 159576 | consumed samples: 83568 | elapsed time per iteration (ms): 14505.1 | learning rate: 2.314E-05 | global batch size: 32 | lm loss: 6.331870E+00 | loss scale: 16384.0 | grad norm: 73276.808 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4089/ 159576 | consumed samples: 83600 | elapsed time per iteration (ms): 14518.3 | learning rate: 2.315E-05 | global batch size: 32 | lm loss: 6.279953E+00 | loss scale: 16384.0 | grad norm: 69193.657 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4090/ 159576 | consumed samples: 83632 | elapsed time per iteration (ms): 14816.9 | learning rate: 2.316E-05 | global batch size: 32 | lm loss: 6.473932E+00 | loss scale: 16384.0 | grad norm: 78838.749 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4091/ 159576 | consumed samples: 83664 | elapsed time per iteration (ms): 14589.1 | learning rate: 2.317E-05 | global batch size: 32 | lm loss: 6.346605E+00 | loss scale: 16384.0 | grad norm: 76401.273 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4092/ 159576 | consumed samples: 83696 | elapsed time per iteration (ms): 14611.5 | learning rate: 2.318E-05 | global batch size: 32 | lm loss: 6.444325E+00 | loss scale: 16384.0 | grad norm: 85411.693 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4093/ 159576 | consumed samples: 83728 | elapsed time per iteration (ms): 14540.2 | learning rate: 2.319E-05 | global batch size: 32 | lm loss: 6.498468E+00 | loss scale: 16384.0 | grad norm: 97013.865 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4094/ 159576 | consumed samples: 83760 | elapsed time per iteration (ms): 14934.5 | learning rate: 2.320E-05 | global batch size: 32 | lm loss: 6.368524E+00 | loss scale: 16384.0 | grad norm: 75310.345 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4095/ 159576 | consumed samples: 83792 | elapsed time per iteration (ms): 14479.4 | learning rate: 2.321E-05 | global batch size: 32 | lm loss: 6.445729E+00 | loss scale: 16384.0 | grad norm: 79666.296 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4096/ 159576 | consumed samples: 83824 | elapsed time per iteration (ms): 14539.3 | learning rate: 2.321E-05 | global batch size: 32 | lm loss: 6.478226E+00 | loss scale: 16384.0 | grad norm: 74953.641 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4097/ 159576 | consumed samples: 83856 | elapsed time per iteration (ms): 14544.9 | learning rate: 2.322E-05 | global batch size: 32 | lm loss: 6.494800E+00 | loss scale: 16384.0 | grad norm: 83444.792 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4098/ 159576 | consumed samples: 83888 | elapsed time per iteration (ms): 14987.3 | learning rate: 2.323E-05 | global batch size: 32 | lm loss: 6.549989E+00 | loss scale: 16384.0 | grad norm: 73065.290 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4099/ 159576 | consumed samples: 83920 | elapsed time per iteration (ms): 14510.7 | learning rate: 2.324E-05 | global batch size: 32 | lm loss: 6.523539E+00 | loss scale: 16384.0 | grad norm: 83625.749 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4100/ 159576 | consumed samples: 83952 | elapsed time per iteration (ms): 14610.5 | learning rate: 2.325E-05 | global batch size: 32 | lm loss: 6.451036E+00 | loss scale: 16384.0 | grad norm: 74563.493 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4101/ 159576 | consumed samples: 83984 | elapsed time per iteration (ms): 14604.4 | learning rate: 2.326E-05 | global batch size: 32 | lm loss: 6.472479E+00 | loss scale: 16384.0 | grad norm: 109783.349 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4102/ 159576 | consumed samples: 84016 | elapsed time per iteration (ms): 14804.2 | learning rate: 2.327E-05 | global batch size: 32 | lm loss: 6.392324E+00 | loss scale: 16384.0 | grad norm: 77708.767 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4103/ 159576 | consumed samples: 84048 | elapsed time per iteration (ms): 14666.7 | learning rate: 2.328E-05 | global batch size: 32 | lm loss: 6.388014E+00 | loss scale: 16384.0 | grad norm: 72228.276 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4104/ 159576 | consumed samples: 84080 | elapsed time per iteration (ms): 14567.0 | learning rate: 2.329E-05 | global batch size: 32 | lm loss: 6.351237E+00 | loss scale: 16384.0 | grad norm: 75762.926 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4105/ 159576 | consumed samples: 84112 | elapsed time per iteration (ms): 14512.3 | learning rate: 2.329E-05 | global batch size: 32 | lm loss: 6.445687E+00 | loss scale: 16384.0 | grad norm: 71985.473 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4106/ 159576 | consumed samples: 84144 | elapsed time per iteration (ms): 14555.0 | learning rate: 2.330E-05 | global batch size: 32 | lm loss: 6.450569E+00 | loss scale: 16384.0 | grad norm: 70873.734 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4107/ 159576 | consumed samples: 84176 | elapsed time per iteration (ms): 14836.4 | learning rate: 2.331E-05 | global batch size: 32 | lm loss: 6.490268E+00 | loss scale: 16384.0 | grad norm: 62324.382 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4108/ 159576 | consumed samples: 84208 | elapsed time per iteration (ms): 14607.5 | learning rate: 2.332E-05 | global batch size: 32 | lm loss: 6.503112E+00 | loss scale: 16384.0 | grad norm: 80147.648 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4109/ 159576 | consumed samples: 84240 | elapsed time per iteration (ms): 14516.1 | learning rate: 2.333E-05 | global batch size: 32 | lm loss: 6.575756E+00 | loss scale: 16384.0 | grad norm: 85277.958 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4110/ 159576 | consumed samples: 84272 | elapsed time per iteration (ms): 14534.3 | learning rate: 2.334E-05 | global batch size: 32 | lm loss: 6.521991E+00 | loss scale: 16384.0 | grad norm: 88147.911 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4111/ 159576 | consumed samples: 84304 | elapsed time per iteration (ms): 14643.4 | learning rate: 2.335E-05 | global batch size: 32 | lm loss: 6.583647E+00 | loss scale: 16384.0 | grad norm: 90470.119 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4112/ 159576 | consumed samples: 84336 | elapsed time per iteration (ms): 14501.6 | learning rate: 2.336E-05 | global batch size: 32 | lm loss: 6.307788E+00 | loss scale: 16384.0 | grad norm: 84679.029 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4113/ 159576 | consumed samples: 84368 | elapsed time per iteration (ms): 14565.5 | learning rate: 2.337E-05 | global batch size: 32 | lm loss: 6.392709E+00 | loss scale: 16384.0 | grad norm: 85222.050 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4114/ 159576 | consumed samples: 84400 | elapsed time per iteration (ms): 14580.4 | learning rate: 2.337E-05 | global batch size: 32 | lm loss: 6.384982E+00 | loss scale: 16384.0 | grad norm: 101932.152 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4115/ 159576 | consumed samples: 84432 | elapsed time per iteration (ms): 14793.7 | learning rate: 2.338E-05 | global batch size: 32 | lm loss: 6.402984E+00 | loss scale: 16384.0 | grad norm: 80725.201 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4116/ 159576 | consumed samples: 84464 | elapsed time per iteration (ms): 14599.8 | learning rate: 2.339E-05 | global batch size: 32 | lm loss: 6.431032E+00 | loss scale: 16384.0 | grad norm: 88365.957 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4117/ 159576 | consumed samples: 84496 | elapsed time per iteration (ms): 14529.0 | learning rate: 2.340E-05 | global batch size: 32 | lm loss: 6.544386E+00 | loss scale: 16384.0 | grad norm: 94647.177 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4118/ 159576 | consumed samples: 84528 | elapsed time per iteration (ms): 14520.8 | learning rate: 2.341E-05 | global batch size: 32 | lm loss: 6.494756E+00 | loss scale: 16384.0 | grad norm: 127914.247 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4119/ 159576 | consumed samples: 84560 | elapsed time per iteration (ms): 14810.4 | learning rate: 2.342E-05 | global batch size: 32 | lm loss: 6.676927E+00 | loss scale: 16384.0 | grad norm: 255152.408 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4120/ 159576 | consumed samples: 84592 | elapsed time per iteration (ms): 14553.6 | learning rate: 2.343E-05 | global batch size: 32 | lm loss: 6.521421E+00 | loss scale: 16384.0 | grad norm: 88738.154 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4121/ 159576 | consumed samples: 84624 | elapsed time per iteration (ms): 14615.1 | learning rate: 2.344E-05 | global batch size: 32 | lm loss: 6.422895E+00 | loss scale: 16384.0 | grad norm: 69394.344 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4122/ 159576 | consumed samples: 84656 | elapsed time per iteration (ms): 14526.7 | learning rate: 2.345E-05 | global batch size: 32 | lm loss: 6.391778E+00 | loss scale: 16384.0 | grad norm: 75006.078 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4123/ 159576 | consumed samples: 84688 | elapsed time per iteration (ms): 14981.6 | learning rate: 2.345E-05 | global batch size: 32 | lm loss: 6.569616E+00 | loss scale: 16384.0 | grad norm: 89357.812 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4124/ 159576 | consumed samples: 84720 | elapsed time per iteration (ms): 14751.3 | learning rate: 2.346E-05 | global batch size: 32 | lm loss: 6.522147E+00 | loss scale: 16384.0 | grad norm: 83006.179 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4125/ 159576 | consumed samples: 84752 | elapsed time per iteration (ms): 14464.7 | learning rate: 2.347E-05 | global batch size: 32 | lm loss: 6.443343E+00 | loss scale: 16384.0 | grad norm: 85692.827 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4126/ 159576 | consumed samples: 84784 | elapsed time per iteration (ms): 14544.8 | learning rate: 2.348E-05 | global batch size: 32 | lm loss: 6.447396E+00 | loss scale: 16384.0 | grad norm: 75026.495 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4127/ 159576 | consumed samples: 84816 | elapsed time per iteration (ms): 14837.3 | learning rate: 2.349E-05 | global batch size: 32 | lm loss: 6.407457E+00 | loss scale: 16384.0 | grad norm: 68031.438 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4128/ 159576 | consumed samples: 84848 | elapsed time per iteration (ms): 14497.8 | learning rate: 2.350E-05 | global batch size: 32 | lm loss: 6.509037E+00 | loss scale: 16384.0 | grad norm: 81823.377 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4129/ 159576 | consumed samples: 84880 | elapsed time per iteration (ms): 14560.1 | learning rate: 2.351E-05 | global batch size: 32 | lm loss: 6.349816E+00 | loss scale: 16384.0 | grad norm: 72346.540 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
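Throughout this stretch the consumed-samples counter advances by exactly the global batch size, 32, per iteration (the counter being far below 32 times the iteration number shows the batch size was smaller earlier in the run), and wall-clock time per iteration stays near 14.5 s. A rough projection of what that pace would imply for the full 159576-iteration schedule, assuming batch size and step time stayed constant from here on:

    # Figures read off the records around iteration 4129.
    iters_done, iters_total = 4129, 159576
    secs_per_iter = 14.5   # typical "elapsed time per iteration (ms)" / 1000
    print((iters_total - iters_done) * secs_per_iter / 86400)  # ~26.1 days remaining

    # Consumed samples move in lockstep with the global batch size:
    assert 84848 + 32 == 84880  # iteration 4128 -> iteration 4129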
- iteration 4130/ 159576 | consumed samples: 84912 | elapsed time per iteration (ms): 14548.5 | learning rate: 2.352E-05 | global batch size: 32 | lm loss: 6.479569E+00 | loss scale: 16384.0 | grad norm: 87336.645 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4131/ 159576 | consumed samples: 84944 | elapsed time per iteration (ms): 14910.1 | learning rate: 2.353E-05 | global batch size: 32 | lm loss: 6.617517E+00 | loss scale: 16384.0 | grad norm: 86374.633 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4132/ 159576 | consumed samples: 84976 | elapsed time per iteration (ms): 14494.2 | learning rate: 2.353E-05 | global batch size: 32 | lm loss: 6.465295E+00 | loss scale: 16384.0 | grad norm: 84022.738 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4133/ 159576 | consumed samples: 85008 | elapsed time per iteration (ms): 14507.6 | learning rate: 2.354E-05 | global batch size: 32 | lm loss: 6.496157E+00 | loss scale: 16384.0 | grad norm: 84787.804 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4134/ 159576 | consumed samples: 85040 | elapsed time per iteration (ms): 14524.7 | learning rate: 2.355E-05 | global batch size: 32 | lm loss: 6.413724E+00 | loss scale: 16384.0 | grad norm: 85852.526 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4135/ 159576 | consumed samples: 85072 | elapsed time per iteration (ms): 14838.8 | learning rate: 2.356E-05 | global batch size: 32 | lm loss: 6.625166E+00 | loss scale: 16384.0 | grad norm: 94635.595 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4136/ 159576 | consumed samples: 85104 | elapsed time per iteration (ms): 14542.4 | learning rate: 2.357E-05 | global batch size: 32 | lm loss: 6.407034E+00 | loss scale: 16384.0 | grad norm: 84861.680 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4137/ 159576 | consumed samples: 85136 | elapsed time per iteration (ms): 14613.1 | learning rate: 2.358E-05 | global batch size: 32 | lm loss: 6.522691E+00 | loss scale: 16384.0 | grad norm: 90819.589 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4138/ 159576 | consumed samples: 85168 | elapsed time per iteration (ms): 14588.1 | learning rate: 2.359E-05 | global batch size: 32 | lm loss: 6.515704E+00 | loss scale: 16384.0 | grad norm: 84641.662 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4139/ 159576 | consumed samples: 85200 | elapsed time per iteration (ms): 14775.7 | learning rate: 2.360E-05 | global batch size: 32 | lm loss: 6.462790E+00 | loss scale: 16384.0 | grad norm: 109335.190 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4140/ 159576 | consumed samples: 85232 | elapsed time per iteration (ms): 14632.9 | learning rate: 2.361E-05 | global batch size: 32 | lm loss: 6.565165E+00 | loss scale: 16384.0 | grad norm: 101408.740 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4141/ 159576 | consumed samples: 85264 | elapsed time per iteration (ms): 14488.2 | learning rate: 2.361E-05 | global batch size: 32 | lm loss: 6.378877E+00 | loss scale: 16384.0 | grad norm: 85177.703 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4142/ 159576 | consumed samples: 85296 | elapsed time per iteration (ms): 14538.0 | learning rate: 2.362E-05 | global batch size: 32 | lm loss: 6.464640E+00 | loss scale: 16384.0 | grad norm: 107413.633 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4143/ 159576 | consumed samples: 85328 | elapsed time per iteration (ms): 14656.2 | learning rate: 2.363E-05 | global batch size: 32 | lm loss: 6.672103E+00 | loss scale: 16384.0 | grad norm: 79187.829 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4144/ 159576 | consumed samples: 85360 | elapsed time per iteration (ms): 14916.7 | learning rate: 2.364E-05 | global batch size: 32 | lm loss: 6.691429E+00 | loss scale: 16384.0 | grad norm: 105292.440 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4145/ 159576 | consumed samples: 85392 | elapsed time per iteration (ms): 14496.1 | learning rate: 2.365E-05 | global batch size: 32 | lm loss: 6.428411E+00 | loss scale: 16384.0 | grad norm: 81232.205 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4146/ 159576 | consumed samples: 85424 | elapsed time per iteration (ms): 14532.5 | learning rate: 2.366E-05 | global batch size: 32 | lm loss: 6.483904E+00 | loss scale: 16384.0 | grad norm: 117143.742 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4147/ 159576 | consumed samples: 85456 | elapsed time per iteration (ms): 14531.1 | learning rate: 2.367E-05 | global batch size: 32 | lm loss: 6.363456E+00 | loss scale: 16384.0 | grad norm: 88860.011 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4148/ 159576 | consumed samples: 85488 | elapsed time per iteration (ms): 14766.7 | learning rate: 2.368E-05 | global batch size: 32 | lm loss: 6.523079E+00 | loss scale: 16384.0 | grad norm: 87677.210 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4149/ 159576 | consumed samples: 85520 | elapsed time per iteration (ms): 14507.2 | learning rate: 2.368E-05 | global batch size: 32 | lm loss: 6.553520E+00 | loss scale: 16384.0 | grad norm: 121742.594 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4150/ 159576 | consumed samples: 85552 | elapsed time per iteration (ms): 14548.6 | learning rate: 2.369E-05 | global batch size: 32 | lm loss: 6.490498E+00 | loss scale: 16384.0 | grad norm: 89599.956 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4151/ 159576 | consumed samples: 85584 | elapsed time per iteration (ms): 14535.8 | learning rate: 2.370E-05 | global batch size: 32 | lm loss: 6.498284E+00 | loss scale: 16384.0 | grad norm: 103857.489 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4152/ 159576 | consumed samples: 85616 | elapsed time per iteration (ms): 14637.7 | learning rate: 2.371E-05 | global batch size: 32 | lm loss: 6.607250E+00 | loss scale: 16384.0 | grad norm: 80792.955 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4153/ 159576 | consumed samples: 85648 | elapsed time per iteration (ms): 14584.8 | learning rate: 2.372E-05 | global batch size: 32 | lm loss: 6.465719E+00 | loss scale: 16384.0 | grad norm: 76852.004 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4154/ 159576 | consumed samples: 85680 | elapsed time per iteration (ms): 14575.3 | learning rate: 2.373E-05 | global batch size: 32 | lm loss: 6.475266E+00 | loss scale: 16384.0 | grad norm: 87775.649 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4155/ 159576 | consumed samples: 85712 | elapsed time per iteration (ms): 14452.5 | learning rate: 2.374E-05 | global batch size: 32 | lm loss: 6.456027E+00 | loss scale: 16384.0 | grad norm: 75377.279 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4156/ 159576 | consumed samples: 85744 | elapsed time per iteration (ms): 14769.4 | learning rate: 2.375E-05 | global batch size: 32 | lm loss: 6.436621E+00 | loss scale: 16384.0 | grad norm: 86270.120 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4157/ 159576 | consumed samples: 85776 | elapsed time per iteration (ms): 14484.6 | learning rate: 2.376E-05 | global batch size: 32 | lm loss: 6.502521E+00 | loss scale: 16384.0 | grad norm: 77291.631 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4158/ 159576 | consumed samples: 85808 | elapsed time per iteration (ms): 14605.4 | learning rate: 2.376E-05 | global batch size: 32 | lm loss: 6.271915E+00 | loss scale: 16384.0 | grad norm: 79782.510 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4159/ 159576 | consumed samples: 85840 | elapsed time per iteration (ms): 14468.5 | learning rate: 2.377E-05 | global batch size: 32 | lm loss: 6.375775E+00 | loss scale: 16384.0 | grad norm: 91679.045 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4160/ 159576 | consumed samples: 85872 | elapsed time per iteration (ms): 15055.2 | learning rate: 2.378E-05 | global batch size: 32 | lm loss: 6.207356E+00 | loss scale: 16384.0 | grad norm: 84700.576 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4161/ 159576 | consumed samples: 85904 | elapsed time per iteration (ms): 14639.9 | learning rate: 2.379E-05 | global batch size: 32 | lm loss: 6.385208E+00 | loss scale: 16384.0 | grad norm: 77383.793 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4162/ 159576 | consumed samples: 85936 | elapsed time per iteration (ms): 14461.5 | learning rate: 2.380E-05 | global batch size: 32 | lm loss: 6.480938E+00 | loss scale: 16384.0 | grad norm: 98154.860 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4163/ 159576 | consumed samples: 85968 | elapsed time per iteration (ms): 14557.2 | learning rate: 2.381E-05 | global batch size: 32 | lm loss: 6.427241E+00 | loss scale: 16384.0 | grad norm: 79663.274 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4164/ 159576 | consumed samples: 86000 | elapsed time per iteration (ms): 15046.3 | learning rate: 2.382E-05 | global batch size: 32 | lm loss: 6.310709E+00 | loss scale: 16384.0 | grad norm: 76469.866 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4165/ 159576 | consumed samples: 86032 | elapsed time per iteration (ms): 14517.1 | learning rate: 2.383E-05 | global batch size: 32 | lm loss: 6.597423E+00 | loss scale: 16384.0 | grad norm: 95179.205 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4166/ 159576 | consumed samples: 86064 | elapsed time per iteration (ms): 14562.4 | learning rate: 2.384E-05 | global batch size: 32 | lm loss: 6.398317E+00 | loss scale: 16384.0 | grad norm: 86889.280 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4167/ 159576 | consumed samples: 86096 | elapsed time per iteration (ms): 14577.1 | learning rate: 2.384E-05 | global batch size: 32 | lm loss: 6.447660E+00 | loss scale: 16384.0 | grad norm: 99510.529 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4168/ 159576 | consumed samples: 86128 | elapsed time per iteration (ms): 14813.0 | learning rate: 2.385E-05 | global batch size: 32 | lm loss: 6.528482E+00 | loss scale: 16384.0 | grad norm: 83413.223 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4169/ 159576 | consumed samples: 86160 | elapsed time per iteration (ms): 14589.9 | learning rate: 2.386E-05 | global batch size: 32 | lm loss: 6.388697E+00 | loss scale: 16384.0 | grad norm: 76722.933 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4170/ 159576 | consumed samples: 86192 | elapsed time per iteration (ms): 14519.5 | learning rate: 2.387E-05 | global batch size: 32 | lm loss: 6.446240E+00 | loss scale: 16384.0 | grad norm: 85947.294 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4171/ 159576 | consumed samples: 86224 | elapsed time per iteration (ms): 14524.6 | learning rate: 2.388E-05 | global batch size: 32 | lm loss: 6.425363E+00 | loss scale: 16384.0 | grad norm: 88474.007 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4172/ 159576 | consumed samples: 86256 | elapsed time per iteration (ms): 14879.2 | learning rate: 2.389E-05 | global batch size: 32 | lm loss: 6.515138E+00 | loss scale: 16384.0 | grad norm: 108134.568 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4173/ 159576 | consumed samples: 86288 | elapsed time per iteration (ms): 14582.3 | learning rate: 2.390E-05 | global batch size: 32 | lm loss: 6.533965E+00 | loss scale: 16384.0 | grad norm: 76749.086 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4174/ 159576 | consumed samples: 86320 | elapsed time per iteration (ms): 14543.3 | learning rate: 2.391E-05 | global batch size: 32 | lm loss: 6.448212E+00 | loss scale: 16384.0 | grad norm: 93972.310 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4175/ 159576 | consumed samples: 86352 | elapsed time per iteration (ms): 14572.0 | learning rate: 2.392E-05
| global batch size: 32 | lm loss: 6.440217E+00 | loss scale: 16384.0 | grad norm: 102291.612 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4176/ 159576 | consumed samples: 86384 | elapsed time per iteration (ms): 14897.3 | learning rate: 2.392E-05 | global batch size: 32 | lm loss: 6.324600E+00 | loss scale: 16384.0 | grad norm: 81057.900 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4177/ 159576 | consumed samples: 86416 | elapsed time per iteration (ms): 14575.9 | learning rate: 2.393E-05 | global batch size: 32 | lm loss: 6.564878E+00 | loss scale: 16384.0 | grad norm: 96270.150 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4178/ 159576 | consumed samples: 86448 | elapsed time per iteration (ms): 14585.7 | learning rate: 2.394E-05 | global batch size: 32 | lm loss: 6.473108E+00 | loss scale: 16384.0 | grad norm: 80498.059 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4179/ 159576 | consumed samples: 86480 | elapsed time per iteration (ms): 14517.6 | learning rate: 2.395E-05 | global batch size: 32 | lm loss: 6.519761E+00 | loss scale: 16384.0 | grad norm: 90509.323 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4180/ 159576 | consumed samples: 86512 | elapsed time per iteration (ms): 14895.7 | learning rate: 2.396E-05 | global batch size: 32 | lm loss: 6.377243E+00 | loss scale: 16384.0 | grad norm: 92370.262 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4181/ 159576 | consumed samples: 86544 | elapsed time per iteration (ms): 14690.0 | learning rate: 2.397E-05 | global batch size: 32 | lm loss: 6.469300E+00 | loss scale: 16384.0 | grad norm: 89492.362 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4182/ 159576 | consumed samples: 86576 | elapsed time per iteration (ms): 14557.6 | learning rate: 2.398E-05 | global batch size: 32 | lm loss: 6.497668E+00 | loss scale: 16384.0 | grad norm: 104899.693 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4183/ 159576 | consumed samples: 86608 | elapsed time per iteration (ms): 14588.2 | learning rate: 2.399E-05 | global batch size: 32 | lm loss: 6.412446E+00 | loss scale: 16384.0 | grad norm: 81267.948 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4184/ 159576 | consumed samples: 86640 | elapsed time per iteration (ms): 14486.7 | learning rate: 2.400E-05 | global batch size: 32 | lm loss: 6.486274E+00 | loss scale: 16384.0 | grad norm: 95404.434 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4185/ 159576 | consumed samples: 86672 | elapsed time per iteration (ms): 14942.6 | learning rate: 2.400E-05 | global batch size: 32 | lm loss: 6.375100E+00 | loss scale: 16384.0 | grad norm: 82372.004 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4186/ 159576 | consumed samples: 86704 | elapsed time per iteration (ms): 14540.4 | learning rate: 2.401E-05 | global batch size: 32 | lm loss: 6.444688E+00 | loss scale: 16384.0 | grad norm: 102268.468 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan 
iterations: 0 | -time (ms) - iteration 4187/ 159576 | consumed samples: 86736 | elapsed time per iteration (ms): 14530.9 | learning rate: 2.402E-05 | global batch size: 32 | lm loss: 6.270885E+00 | loss scale: 16384.0 | grad norm: 85114.431 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4188/ 159576 | consumed samples: 86768 | elapsed time per iteration (ms): 14554.4 | learning rate: 2.403E-05 | global batch size: 32 | lm loss: 6.461191E+00 | loss scale: 16384.0 | grad norm: 82795.343 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4189/ 159576 | consumed samples: 86800 | elapsed time per iteration (ms): 14680.7 | learning rate: 2.404E-05 | global batch size: 32 | lm loss: 6.483377E+00 | loss scale: 16384.0 | grad norm: 106142.212 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4190/ 159576 | consumed samples: 86832 | elapsed time per iteration (ms): 14652.1 | learning rate: 2.405E-05 | global batch size: 32 | lm loss: 6.468819E+00 | loss scale: 16384.0 | grad norm: 83557.244 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4191/ 159576 | consumed samples: 86864 | elapsed time per iteration (ms): 14459.3 | learning rate: 2.406E-05 | global batch size: 32 | lm loss: 6.379012E+00 | loss scale: 16384.0 | grad norm: 90619.727 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4192/ 159576 | consumed samples: 86896 | elapsed time per iteration (ms): 14539.1 | learning rate: 2.407E-05 | global batch size: 32 | lm loss: 6.459314E+00 | loss scale: 16384.0 | grad norm: 94282.455 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4193/ 159576 | consumed samples: 86928 | elapsed time per iteration (ms): 14715.7 | learning rate: 2.408E-05 | global batch size: 32 | lm loss: 6.435170E+00 | loss scale: 16384.0 | grad norm: 92946.885 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4194/ 159576 | consumed samples: 86960 | elapsed time per iteration (ms): 14501.7 | learning rate: 2.408E-05 | global batch size: 32 | lm loss: 6.419791E+00 | loss scale: 16384.0 | grad norm: 78251.108 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4195/ 159576 | consumed samples: 86992 | elapsed time per iteration (ms): 14523.0 | learning rate: 2.409E-05 | global batch size: 32 | lm loss: 6.342591E+00 | loss scale: 16384.0 | grad norm: 80571.454 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4196/ 159576 | consumed samples: 87024 | elapsed time per iteration (ms): 14595.3 | learning rate: 2.410E-05 | global batch size: 32 | lm loss: 6.373145E+00 | loss scale: 16384.0 | grad norm: 106409.932 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4197/ 159576 | consumed samples: 87056 | elapsed time per iteration (ms): 14737.5 | learning rate: 2.411E-05 | global batch size: 32 | lm loss: 6.543087E+00 | loss scale: 16384.0 | grad norm: 81359.049 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4198/ 159576 | consumed samples: 87088 | elapsed time per iteration (ms): 14570.3 | learning rate: 2.412E-05 | global batch 
size: 32 | lm loss: 6.555972E+00 | loss scale: 16384.0 | grad norm: 101442.652 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4199/ 159576 | consumed samples: 87120 | elapsed time per iteration (ms): 14518.0 | learning rate: 2.413E-05 | global batch size: 32 | lm loss: 6.497987E+00 | loss scale: 16384.0 | grad norm: 87789.780 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4200/ 159576 | consumed samples: 87152 | elapsed time per iteration (ms): 14561.0 | learning rate: 2.414E-05 | global batch size: 32 | lm loss: 6.526636E+00 | loss scale: 16384.0 | grad norm: 97375.608 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4201/ 159576 | consumed samples: 87184 | elapsed time per iteration (ms): 14967.8 | learning rate: 2.415E-05 | global batch size: 32 | lm loss: 6.529594E+00 | loss scale: 16384.0 | grad norm: 98056.606 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4202/ 159576 | consumed samples: 87216 | elapsed time per iteration (ms): 14591.5 | learning rate: 2.416E-05 | global batch size: 32 | lm loss: 6.461559E+00 | loss scale: 16384.0 | grad norm: 103248.801 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4203/ 159576 | consumed samples: 87248 | elapsed time per iteration (ms): 14557.3 | learning rate: 2.416E-05 | global batch size: 32 | lm loss: 6.255905E+00 | loss scale: 16384.0 | grad norm: 98489.984 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4204/ 159576 | consumed samples: 87280 | elapsed time per iteration (ms): 14539.8 | learning rate: 2.417E-05 | global batch size: 32 | lm loss: 6.456792E+00 | loss scale: 16384.0 | grad norm: 90220.601 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4205/ 159576 | consumed samples: 87312 | elapsed time per iteration (ms): 14936.2 | learning rate: 2.418E-05 | global batch size: 32 | lm loss: 6.456956E+00 | loss scale: 16384.0 | grad norm: 99591.028 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4206/ 159576 | consumed samples: 87344 | elapsed time per iteration (ms): 14602.1 | learning rate: 2.419E-05 | global batch size: 32 | lm loss: 6.539675E+00 | loss scale: 16384.0 | grad norm: 106461.971 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4207/ 159576 | consumed samples: 87376 | elapsed time per iteration (ms): 14518.5 | learning rate: 2.420E-05 | global batch size: 32 | lm loss: 6.581583E+00 | loss scale: 16384.0 | grad norm: 104474.944 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4208/ 159576 | consumed samples: 87408 | elapsed time per iteration (ms): 14546.2 | learning rate: 2.421E-05 | global batch size: 32 | lm loss: 6.470299E+00 | loss scale: 16384.0 | grad norm: 103936.744 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4209/ 159576 | consumed samples: 87440 | elapsed time per iteration (ms): 14895.0 | learning rate: 2.422E-05 | global batch size: 32 | lm loss: 6.485046E+00 | loss scale: 16384.0 | grad norm: 103480.479 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | 
-time (ms) - iteration 4210/ 159576 | consumed samples: 87472 | elapsed time per iteration (ms): 14490.7 | learning rate: 2.423E-05 | global batch size: 32 | lm loss: 6.331614E+00 | loss scale: 16384.0 | grad norm: 92393.675 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4211/ 159576 | consumed samples: 87504 | elapsed time per iteration (ms): 14505.6 | learning rate: 2.424E-05 | global batch size: 32 | lm loss: 6.343493E+00 | loss scale: 16384.0 | grad norm: 138840.853 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4212/ 159576 | consumed samples: 87536 | elapsed time per iteration (ms): 14559.8 | learning rate: 2.424E-05 | global batch size: 32 | lm loss: 6.362164E+00 | loss scale: 16384.0 | grad norm: 105314.560 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4213/ 159576 | consumed samples: 87568 | elapsed time per iteration (ms): 14962.7 | learning rate: 2.425E-05 | global batch size: 32 | lm loss: 6.413978E+00 | loss scale: 16384.0 | grad norm: 100396.214 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4214/ 159576 | consumed samples: 87600 | elapsed time per iteration (ms): 14459.8 | learning rate: 2.426E-05 | global batch size: 32 | lm loss: 6.333343E+00 | loss scale: 16384.0 | grad norm: 101809.236 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4215/ 159576 | consumed samples: 87632 | elapsed time per iteration (ms): 14541.9 | learning rate: 2.427E-05 | global batch size: 32 | lm loss: 6.552740E+00 | loss scale: 16384.0 | grad norm: 198031.215 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4216/ 159576 | consumed samples: 87664 | elapsed time per iteration (ms): 14546.7 | learning rate: 2.428E-05 | global batch size: 32 | lm loss: 6.373903E+00 | loss scale: 16384.0 | grad norm: 98034.031 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4217/ 159576 | consumed samples: 87696 | elapsed time per iteration (ms): 14848.3 | learning rate: 2.429E-05 | global batch size: 32 | lm loss: 6.452424E+00 | loss scale: 16384.0 | grad norm: 267522.576 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4218/ 159576 | consumed samples: 87728 | elapsed time per iteration (ms): 14570.6 | learning rate: 2.430E-05 | global batch size: 32 | lm loss: 6.493920E+00 | loss scale: 16384.0 | grad norm: 121372.560 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4219/ 159576 | consumed samples: 87760 | elapsed time per iteration (ms): 14553.1 | learning rate: 2.431E-05 | global batch size: 32 | lm loss: 6.478834E+00 | loss scale: 16384.0 | grad norm: 112151.991 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4220/ 159576 | consumed samples: 87792 | elapsed time per iteration (ms): 14546.6 | learning rate: 2.432E-05 | global batch size: 32 | lm loss: 6.452081E+00 | loss scale: 16384.0 | grad norm: 164176.147 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4221/ 159576 | consumed samples: 87824 | elapsed time per iteration (ms): 14866.7 | learning rate: 2.432E-05 | global batch size: 32 | 
lm loss: 6.616721E+00 | loss scale: 16384.0 | grad norm: 88412.117 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4222/ 159576 | consumed samples: 87856 | elapsed time per iteration (ms): 14831.9 | learning rate: 2.433E-05 | global batch size: 32 | lm loss: 6.396004E+00 | loss scale: 16384.0 | grad norm: 116548.345 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4223/ 159576 | consumed samples: 87888 | elapsed time per iteration (ms): 14530.1 | learning rate: 2.434E-05 | global batch size: 32 | lm loss: 6.223457E+00 | loss scale: 16384.0 | grad norm: 151936.770 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4224/ 159576 | consumed samples: 87920 | elapsed time per iteration (ms): 14526.4 | learning rate: 2.435E-05 | global batch size: 32 | lm loss: 6.471479E+00 | loss scale: 16384.0 | grad norm: 107150.884 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4225/ 159576 | consumed samples: 87952 | elapsed time per iteration (ms): 14556.3 | learning rate: 2.436E-05 | global batch size: 32 | lm loss: 6.420123E+00 | loss scale: 16384.0 | grad norm: 118336.101 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4226/ 159576 | consumed samples: 87984 | elapsed time per iteration (ms): 14779.5 | learning rate: 2.437E-05 | global batch size: 32 | lm loss: 6.463729E+00 | loss scale: 16384.0 | grad norm: 105104.920 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4227/ 159576 | consumed samples: 88016 | elapsed time per iteration (ms): 14616.1 | learning rate: 2.438E-05 | global batch size: 32 | lm loss: 6.384348E+00 | loss scale: 16384.0 | grad norm: 121857.325 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4228/ 159576 | consumed samples: 88048 | elapsed time per iteration (ms): 14595.0 | learning rate: 2.439E-05 | global batch size: 32 | lm loss: 6.562186E+00 | loss scale: 16384.0 | grad norm: 120895.871 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4229/ 159576 | consumed samples: 88080 | elapsed time per iteration (ms): 14592.9 | learning rate: 2.439E-05 | global batch size: 32 | lm loss: 6.614166E+00 | loss scale: 16384.0 | grad norm: 141989.840 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4230/ 159576 | consumed samples: 88112 | elapsed time per iteration (ms): 14745.8 | learning rate: 2.440E-05 | global batch size: 32 | lm loss: 6.416856E+00 | loss scale: 16384.0 | grad norm: 135385.270 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4231/ 159576 | consumed samples: 88144 | elapsed time per iteration (ms): 14547.3 | learning rate: 2.441E-05 | global batch size: 32 | lm loss: 6.576384E+00 | loss scale: 16384.0 | grad norm: 129034.853 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4232/ 159576 | consumed samples: 88176 | elapsed time per iteration (ms): 14539.9 | learning rate: 2.442E-05 | global batch size: 32 | lm loss: 6.371499E+00 | loss scale: 16384.0 | grad norm: 102463.674 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time 
(ms) - iteration 4233/ 159576 | consumed samples: 88208 | elapsed time per iteration (ms): 14580.8 | learning rate: 2.443E-05 | global batch size: 32 | lm loss: 6.598085E+00 | loss scale: 16384.0 | grad norm: 105075.872 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4234/ 159576 | consumed samples: 88240 | elapsed time per iteration (ms): 14766.2 | learning rate: 2.444E-05 | global batch size: 32 | lm loss: 6.536204E+00 | loss scale: 16384.0 | grad norm: 109004.528 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4235/ 159576 | consumed samples: 88272 | elapsed time per iteration (ms): 14518.0 | learning rate: 2.445E-05 | global batch size: 32 | lm loss: 6.663161E+00 | loss scale: 16384.0 | grad norm: 197099.956 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4236/ 159576 | consumed samples: 88304 | elapsed time per iteration (ms): 14598.2 | learning rate: 2.446E-05 | global batch size: 32 | lm loss: 6.451008E+00 | loss scale: 16384.0 | grad norm: 125746.339 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4237/ 159576 | consumed samples: 88336 | elapsed time per iteration (ms): 14568.7 | learning rate: 2.447E-05 | global batch size: 32 | lm loss: 6.306778E+00 | loss scale: 16384.0 | grad norm: 145717.953 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4238/ 159576 | consumed samples: 88368 | elapsed time per iteration (ms): 14844.4 | learning rate: 2.447E-05 | global batch size: 32 | lm loss: 6.637146E+00 | loss scale: 16384.0 | grad norm: 161986.022 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4239/ 159576 | consumed samples: 88400 | elapsed time per iteration (ms): 14550.6 | learning rate: 2.448E-05 | global batch size: 32 | lm loss: 6.518569E+00 | loss scale: 16384.0 | grad norm: 114815.197 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4240/ 159576 | consumed samples: 88432 | elapsed time per iteration (ms): 14540.5 | learning rate: 2.449E-05 | global batch size: 32 | lm loss: 6.644086E+00 | loss scale: 16384.0 | grad norm: 127083.954 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4241/ 159576 | consumed samples: 88464 | elapsed time per iteration (ms): 14556.9 | learning rate: 2.450E-05 | global batch size: 32 | lm loss: 6.359149E+00 | loss scale: 16384.0 | grad norm: 119916.985 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4242/ 159576 | consumed samples: 88496 | elapsed time per iteration (ms): 14950.3 | learning rate: 2.451E-05 | global batch size: 32 | lm loss: 6.517668E+00 | loss scale: 16384.0 | grad norm: 116850.173 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4243/ 159576 | consumed samples: 88528 | elapsed time per iteration (ms): 14575.9 | learning rate: 2.452E-05 | global batch size: 32 | lm loss: 6.345152E+00 | loss scale: 16384.0 | grad norm: 106829.623 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4244/ 159576 | consumed samples: 88560 | elapsed time per iteration (ms): 14588.0 | learning rate: 2.453E-05 | global batch size: 32 | lm 
loss: 6.476923E+00 | loss scale: 16384.0 | grad norm: 121409.721 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4245/ 159576 | consumed samples: 88592 | elapsed time per iteration (ms): 14539.0 | learning rate: 2.454E-05 | global batch size: 32 | lm loss: 6.428369E+00 | loss scale: 16384.0 | grad norm: 99872.898 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4246/ 159576 | consumed samples: 88624 | elapsed time per iteration (ms): 15044.1 | learning rate: 2.455E-05 | global batch size: 32 | lm loss: 6.447415E+00 | loss scale: 16384.0 | grad norm: 102765.648 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4247/ 159576 | consumed samples: 88656 | elapsed time per iteration (ms): 14546.9 | learning rate: 2.455E-05 | global batch size: 32 | lm loss: 6.336578E+00 | loss scale: 16384.0 | grad norm: 90835.944 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4248/ 159576 | consumed samples: 88688 | elapsed time per iteration (ms): 14540.1 | learning rate: 2.456E-05 | global batch size: 32 | lm loss: 6.555513E+00 | loss scale: 16384.0 | grad norm: 104407.993 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4249/ 159576 | consumed samples: 88720 | elapsed time per iteration (ms): 14613.4 | learning rate: 2.457E-05 | global batch size: 32 | lm loss: 6.546042E+00 | loss scale: 16384.0 | grad norm: 115379.011 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4250/ 159576 | consumed samples: 88752 | elapsed time per iteration (ms): 14829.6 | learning rate: 2.458E-05 | global batch size: 32 | lm loss: 6.436588E+00 | loss scale: 16384.0 | grad norm: 107293.323 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4251/ 159576 | consumed samples: 88784 | elapsed time per iteration (ms): 14544.9 | learning rate: 2.459E-05 | global batch size: 32 | lm loss: 6.438442E+00 | loss scale: 16384.0 | grad norm: 105034.238 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4252/ 159576 | consumed samples: 88816 | elapsed time per iteration (ms): 14563.6 | learning rate: 2.460E-05 | global batch size: 32 | lm loss: 6.473608E+00 | loss scale: 16384.0 | grad norm: 84036.769 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4253/ 159576 | consumed samples: 88848 | elapsed time per iteration (ms): 14528.1 | learning rate: 2.461E-05 | global batch size: 32 | lm loss: 6.422614E+00 | loss scale: 16384.0 | grad norm: 95068.711 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4254/ 159576 | consumed samples: 88880 | elapsed time per iteration (ms): 14918.1 | learning rate: 2.462E-05 | global batch size: 32 | lm loss: 6.295578E+00 | loss scale: 16384.0 | grad norm: 114489.641 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4255/ 159576 | consumed samples: 88912 | elapsed time per iteration (ms): 14525.9 | learning rate: 2.463E-05 | global batch size: 32 | lm loss: 6.416272E+00 | loss scale: 16384.0 | grad norm: 91261.339 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - 
iteration 4256/ 159576 | consumed samples: 88944 | elapsed time per iteration (ms): 14525.5 | learning rate: 2.463E-05 | global batch size: 32 | lm loss: 6.517479E+00 | loss scale: 32768.0 | grad norm: 94254.434 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4257/ 159576 | consumed samples: 88976 | elapsed time per iteration (ms): 14555.5 | learning rate: 2.464E-05 | global batch size: 32 | lm loss: 6.469455E+00 | loss scale: 32768.0 | grad norm: 174372.981 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4258/ 159576 | consumed samples: 89008 | elapsed time per iteration (ms): 14928.2 | learning rate: 2.465E-05 | global batch size: 32 | lm loss: 6.408867E+00 | loss scale: 32768.0 | grad norm: 205212.434 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4259/ 159576 | consumed samples: 89040 | elapsed time per iteration (ms): 14529.5 | learning rate: 2.466E-05 | global batch size: 32 | lm loss: 6.518348E+00 | loss scale: 32768.0 | grad norm: 175125.876 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4260/ 159576 | consumed samples: 89072 | elapsed time per iteration (ms): 14608.9 | learning rate: 2.467E-05 | global batch size: 32 | lm loss: 6.456366E+00 | loss scale: 32768.0 | grad norm: 180925.606 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4261/ 159576 | consumed samples: 89104 | elapsed time per iteration (ms): 14541.2 | learning rate: 2.468E-05 | global batch size: 32 | lm loss: 6.688640E+00 | loss scale: 32768.0 | grad norm: 205129.683 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4262/ 159576 | consumed samples: 89136 | elapsed time per iteration (ms): 14984.8 | learning rate: 2.469E-05 | global batch size: 32 | lm loss: 6.381848E+00 | loss scale: 32768.0 | grad norm: 194086.359 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4263/ 159576 | consumed samples: 89168 | elapsed time per iteration (ms): 14627.4 | learning rate: 2.470E-05 | global batch size: 32 | lm loss: 6.325251E+00 | loss scale: 32768.0 | grad norm: 200329.366 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4264/ 159576 | consumed samples: 89200 | elapsed time per iteration (ms): 14514.4 | learning rate: 2.471E-05 | global batch size: 32 | lm loss: 6.384187E+00 | loss scale: 32768.0 | grad norm: 206513.330 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4265/ 159576 | consumed samples: 89232 | elapsed time per iteration (ms): 14532.8 | learning rate: 2.471E-05 | global batch size: 32 | lm loss: 6.524798E+00 | loss scale: 32768.0 | grad norm: 207588.856 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4266/ 159576 | consumed samples: 89264 | elapsed time per iteration (ms): 14499.0 | learning rate: 2.472E-05 | global batch size: 32 | lm loss: 6.427965E+00 | loss scale: 32768.0 | grad norm: 270396.360 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4267/ 159576 | consumed samples: 89296 | elapsed time per iteration (ms): 14964.3 | learning rate: 2.473E-05 | global batch size: 32 | lm loss: 
6.508441E+00 | loss scale: 32768.0 | grad norm: 256825.207 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4268/ 159576 | consumed samples: 89328 | elapsed time per iteration (ms): 14573.4 | learning rate: 2.474E-05 | global batch size: 32 | lm loss: 6.281446E+00 | loss scale: 32768.0 | grad norm: 175050.841 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4269/ 159576 | consumed samples: 89360 | elapsed time per iteration (ms): 14497.3 | learning rate: 2.475E-05 | global batch size: 32 | lm loss: 6.477619E+00 | loss scale: 32768.0 | grad norm: 194699.259 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4270/ 159576 | consumed samples: 89392 | elapsed time per iteration (ms): 14560.8 | learning rate: 2.476E-05 | global batch size: 32 | lm loss: 6.521669E+00 | loss scale: 32768.0 | grad norm: 204025.819 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4271/ 159576 | consumed samples: 89424 | elapsed time per iteration (ms): 14634.9 | learning rate: 2.477E-05 | global batch size: 32 | lm loss: 6.532991E+00 | loss scale: 32768.0 | grad norm: 218350.369 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4272/ 159576 | consumed samples: 89456 | elapsed time per iteration (ms): 14566.6 | learning rate: 2.478E-05 | global batch size: 32 | lm loss: 6.491451E+00 | loss scale: 32768.0 | grad norm: 196213.759 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4273/ 159576 | consumed samples: 89488 | elapsed time per iteration (ms): 14504.5 | learning rate: 2.479E-05 | global batch size: 32 | lm loss: 6.527338E+00 | loss scale: 32768.0 | grad norm: 254430.436 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4274/ 159576 | consumed samples: 89520 | elapsed time per iteration (ms): 14538.5 | learning rate: 2.479E-05 | global batch size: 32 | lm loss: 6.303001E+00 | loss scale: 32768.0 | grad norm: 189173.505 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4275/ 159576 | consumed samples: 89552 | elapsed time per iteration (ms): 14691.4 | learning rate: 2.480E-05 | global batch size: 32 | lm loss: 6.465518E+00 | loss scale: 32768.0 | grad norm: 266867.999 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4276/ 159576 | consumed samples: 89584 | elapsed time per iteration (ms): 14571.4 | learning rate: 2.481E-05 | global batch size: 32 | lm loss: 6.562708E+00 | loss scale: 32768.0 | grad norm: 213181.091 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4277/ 159576 | consumed samples: 89616 | elapsed time per iteration (ms): 14513.3 | learning rate: 2.482E-05 | global batch size: 32 | lm loss: 6.490031E+00 | loss scale: 32768.0 | grad norm: 200238.543 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4278/ 159576 | consumed samples: 89648 | elapsed time per iteration (ms): 14545.3 | learning rate: 2.483E-05 | global batch size: 32 | lm loss: 6.452188E+00 | loss scale: 32768.0 | grad norm: 209603.587 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - 
iteration 4279/ 159576 | consumed samples: 89680 | elapsed time per iteration (ms): 14892.6 | learning rate: 2.484E-05 | global batch size: 32 | lm loss: 6.402837E+00 | loss scale: 32768.0 | grad norm: 213512.626 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4280/ 159576 | consumed samples: 89712 | elapsed time per iteration (ms): 14552.6 | learning rate: 2.485E-05 | global batch size: 32 | lm loss: 6.481530E+00 | loss scale: 32768.0 | grad norm: 218939.275 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4281/ 159576 | consumed samples: 89744 | elapsed time per iteration (ms): 14525.9 | learning rate: 2.486E-05 | global batch size: 32 | lm loss: 6.481557E+00 | loss scale: 32768.0 | grad norm: 211553.359 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4282/ 159576 | consumed samples: 89776 | elapsed time per iteration (ms): 14536.1 | learning rate: 2.487E-05 | global batch size: 32 | lm loss: 6.396571E+00 | loss scale: 32768.0 | grad norm: 200119.282 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4283/ 159576 | consumed samples: 89808 | elapsed time per iteration (ms): 14897.4 | learning rate: 2.487E-05 | global batch size: 32 | lm loss: 6.437448E+00 | loss scale: 32768.0 | grad norm: 211733.893 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4284/ 159576 | consumed samples: 89840 | elapsed time per iteration (ms): 14635.9 | learning rate: 2.488E-05 | global batch size: 32 | lm loss: 6.477830E+00 | loss scale: 32768.0 | grad norm: 273937.689 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4285/ 159576 | consumed samples: 89872 | elapsed time per iteration (ms): 14565.4 | learning rate: 2.489E-05 | global batch size: 32 | lm loss: 6.567824E+00 | loss scale: 32768.0 | grad norm: 210402.154 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4286/ 159576 | consumed samples: 89904 | elapsed time per iteration (ms): 14519.6 | learning rate: 2.490E-05 | global batch size: 32 | lm loss: 6.385768E+00 | loss scale: 32768.0 | grad norm: 203200.040 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4287/ 159576 | consumed samples: 89936 | elapsed time per iteration (ms): 14914.9 | learning rate: 2.491E-05 | global batch size: 32 | lm loss: 6.397992E+00 | loss scale: 32768.0 | grad norm: 182816.610 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4288/ 159576 | consumed samples: 89968 | elapsed time per iteration (ms): 14476.6 | learning rate: 2.492E-05 | global batch size: 32 | lm loss: 6.388610E+00 | loss scale: 32768.0 | grad norm: 199735.518 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4289/ 159576 | consumed samples: 90000 | elapsed time per iteration (ms): 14570.5 | learning rate: 2.493E-05 | global batch size: 32 | lm loss: 6.506209E+00 | loss scale: 32768.0 | grad norm: 206990.921 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4290/ 159576 | consumed samples: 90032 | elapsed time per iteration (ms): 14531.9 | learning rate: 2.494E-05 | global batch size: 32 | lm loss: 
6.351604E+00 | loss scale: 32768.0 | grad norm: 204481.534 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4291/ 159576 | consumed samples: 90064 | elapsed time per iteration (ms): 14860.6 | learning rate: 2.495E-05 | global batch size: 32 | lm loss: 6.518882E+00 | loss scale: 32768.0 | grad norm: 236219.696 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4292/ 159576 | consumed samples: 90096 | elapsed time per iteration (ms): 14581.4 | learning rate: 2.495E-05 | global batch size: 32 | lm loss: 6.428777E+00 | loss scale: 32768.0 | grad norm: 187907.904 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4293/ 159576 | consumed samples: 90128 | elapsed time per iteration (ms): 14508.1 | learning rate: 2.496E-05 | global batch size: 32 | lm loss: 6.327142E+00 | loss scale: 32768.0 | grad norm: 204872.451 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4294/ 159576 | consumed samples: 90160 | elapsed time per iteration (ms): 14534.7 | learning rate: 2.497E-05 | global batch size: 32 | lm loss: 6.385339E+00 | loss scale: 32768.0 | grad norm: 233375.233 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4295/ 159576 | consumed samples: 90192 | elapsed time per iteration (ms): 14858.3 | learning rate: 2.498E-05 | global batch size: 32 | lm loss: 6.416627E+00 | loss scale: 32768.0 | grad norm: 222806.309 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4296/ 159576 | consumed samples: 90224 | elapsed time per iteration (ms): 14474.6 | learning rate: 2.499E-05 | global batch size: 32 | lm loss: 6.518059E+00 | loss scale: 32768.0 | grad norm: 226593.449 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4297/ 159576 | consumed samples: 90256 | elapsed time per iteration (ms): 14569.0 | learning rate: 2.500E-05 | global batch size: 32 | lm loss: 6.133147E+00 | loss scale: 32768.0 | grad norm: 267419.394 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4298/ 159576 | consumed samples: 90288 | elapsed time per iteration (ms): 14566.4 | learning rate: 2.501E-05 | global batch size: 32 | lm loss: 6.308548E+00 | loss scale: 32768.0 | grad norm: 204598.561 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4299/ 159576 | consumed samples: 90320 | elapsed time per iteration (ms): 14984.7 | learning rate: 2.502E-05 | global batch size: 32 | lm loss: 6.369866E+00 | loss scale: 32768.0 | grad norm: 221545.190 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4300/ 159576 | consumed samples: 90352 | elapsed time per iteration (ms): 14484.6 | learning rate: 2.503E-05 | global batch size: 32 | lm loss: 6.530766E+00 | loss scale: 32768.0 | grad norm: 267800.060 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4301/ 159576 | consumed samples: 90384 | elapsed time per iteration (ms): 14557.5 | learning rate: 2.503E-05 | global batch size: 32 | lm loss: 6.503004E+00 | loss scale: 32768.0 | grad norm: 228461.361 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - 
iteration 4302/ 159576 | consumed samples: 90416 | elapsed time per iteration (ms): 14550.0 | learning rate: 2.504E-05 | global batch size: 32 | lm loss: 6.538440E+00 | loss scale: 32768.0 | grad norm: 190026.980 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4303/ 159576 | consumed samples: 90448 | elapsed time per iteration (ms): 14655.7 | learning rate: 2.505E-05 | global batch size: 32 | lm loss: 6.461242E+00 | loss scale: 32768.0 | grad norm: 211257.650 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4304/ 159576 | consumed samples: 90480 | elapsed time per iteration (ms): 14769.1 | learning rate: 2.506E-05 | global batch size: 32 | lm loss: 6.479248E+00 | loss scale: 32768.0 | grad norm: 198712.587 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4305/ 159576 | consumed samples: 90512 | elapsed time per iteration (ms): 14577.3 | learning rate: 2.507E-05 | global batch size: 32 | lm loss: 6.432651E+00 | loss scale: 32768.0 | grad norm: 206822.372 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4306/ 159576 | consumed samples: 90544 | elapsed time per iteration (ms): 14533.2 | learning rate: 2.508E-05 | global batch size: 32 | lm loss: 6.347961E+00 | loss scale: 32768.0 | grad norm: 195748.989 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4307/ 159576 | consumed samples: 90576 | elapsed time per iteration (ms): 14563.8 | learning rate: 2.509E-05 | global batch size: 32 | lm loss: 6.507642E+00 | loss scale: 32768.0 | grad norm: 218663.158 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4308/ 159576 | consumed samples: 90608 | elapsed time per iteration (ms): 14732.7 | learning rate: 2.510E-05 | global batch size: 32 | lm loss: 6.541059E+00 | loss scale: 32768.0 | grad norm: 228970.274 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4309/ 159576 | consumed samples: 90640 | elapsed time per iteration (ms): 14469.9 | learning rate: 2.511E-05 | global batch size: 32 | lm loss: 6.424891E+00 | loss scale: 32768.0 | grad norm: 196198.889 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4310/ 159576 | consumed samples: 90672 | elapsed time per iteration (ms): 14508.3 | learning rate: 2.511E-05 | global batch size: 32 | lm loss: 6.490376E+00 | loss scale: 32768.0 | grad norm: 215960.903 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4311/ 159576 | consumed samples: 90704 | elapsed time per iteration (ms): 14508.3 | learning rate: 2.512E-05 | global batch size: 32 | lm loss: 6.488754E+00 | loss scale: 32768.0 | grad norm: 195374.466 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4312/ 159576 | consumed samples: 90736 | elapsed time per iteration (ms): 14753.9 | learning rate: 2.513E-05 | global batch size: 32 | lm loss: 6.448671E+00 | loss scale: 32768.0 | grad norm: 227732.025 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4313/ 159576 | consumed samples: 90768 | elapsed time per iteration (ms): 14571.8 | learning rate: 2.514E-05 | global batch size: 32 | lm loss: 
6.500753E+00 | loss scale: 32768.0 | grad norm: 266264.636 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4314/ 159576 | consumed samples: 90800 | elapsed time per iteration (ms): 14601.7 | learning rate: 2.515E-05 | global batch size: 32 | lm loss: 6.454448E+00 | loss scale: 32768.0 | grad norm: 224312.848 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4315/ 159576 | consumed samples: 90832 | elapsed time per iteration (ms): 14520.9 | learning rate: 2.516E-05 | global batch size: 32 | lm loss: 6.340928E+00 | loss scale: 32768.0 | grad norm: 252168.513 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4316/ 159576 | consumed samples: 90864 | elapsed time per iteration (ms): 14650.6 | learning rate: 2.517E-05 | global batch size: 32 | lm loss: 6.524774E+00 | loss scale: 32768.0 | grad norm: 233060.511 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4317/ 159576 | consumed samples: 90896 | elapsed time per iteration (ms): 14507.8 | learning rate: 2.518E-05 | global batch size: 32 | lm loss: 6.526123E+00 | loss scale: 32768.0 | grad norm: 228145.157 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4318/ 159576 | consumed samples: 90928 | elapsed time per iteration (ms): 14505.6 | learning rate: 2.518E-05 | global batch size: 32 | lm loss: 6.554380E+00 | loss scale: 32768.0 | grad norm: 215247.212 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -[2021-09-24 19:07:09] PULSE: tr8-104B is waiting for the previous job to finish before scheduling a new one using the dependency mechanism (1165978_[1-10%1] on 'gpu_p13' partition) -[2021-09-24 19:07:09] PULSE: tr8-104B is running for 13:14:58 since 2021-09-24T05:52:11 (1162855_1 on 'gpu_p13' partition (r6i4n[5,7],r6i5n[2,7-8],r6i6n[0,2,6],r7i2n[4-5],r7i6n[2-4],r7i7n[7-8],r8i0n[2-3,5-8],r8i1n[0,2-4],r8i2n8,r8i3n[0-2],r8i5n[3-4],r8i7n[3-8],r9i0n[0-2],r9i1n[0-3],r9i2n[3-5,8],r9i3n[0-1,7-8],r9i4n[0-2],r9i5n[3-8],r9i6n[0,7-8]) - iteration 4319/ 159576 | consumed samples: 90960 | elapsed time per iteration (ms): 14496.4 | learning rate: 2.519E-05 | global batch size: 32 | lm loss: 6.312326E+00 | loss scale: 32768.0 | grad norm: 214751.055 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4320/ 159576 | consumed samples: 90992 | elapsed time per iteration (ms): 14941.6 | learning rate: 2.520E-05 | global batch size: 32 | lm loss: 6.452510E+00 | loss scale: 32768.0 | grad norm: 260142.714 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4321/ 159576 | consumed samples: 91024 | elapsed time per iteration (ms): 14618.7 | learning rate: 2.521E-05 | global batch size: 32 | lm loss: 6.420647E+00 | loss scale: 32768.0 | grad norm: 225655.261 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4322/ 159576 | consumed samples: 91056 | elapsed time per iteration (ms): 14566.6 | learning rate: 2.522E-05 | global batch size: 32 | lm loss: 6.402806E+00 | loss scale: 32768.0 | grad norm: 291928.342 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4323/ 159576 | consumed samples: 91088 | elapsed time per iteration (ms): 
14498.7 | learning rate: 2.523E-05 | global batch size: 32 | lm loss: 6.391022E+00 | loss scale: 32768.0 | grad norm: 237551.777 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4324/ 159576 | consumed samples: 91120 | elapsed time per iteration (ms): 15211.7 | learning rate: 2.524E-05 | global batch size: 32 | lm loss: 6.430393E+00 | loss scale: 32768.0 | grad norm: 234733.593 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4325/ 159576 | consumed samples: 91152 | elapsed time per iteration (ms): 14439.1 | learning rate: 2.525E-05 | global batch size: 32 | lm loss: 6.406878E+00 | loss scale: 32768.0 | grad norm: 212091.318 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4326/ 159576 | consumed samples: 91184 | elapsed time per iteration (ms): 14533.1 | learning rate: 2.526E-05 | global batch size: 32 | lm loss: 6.439167E+00 | loss scale: 32768.0 | grad norm: 244000.757 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4327/ 159576 | consumed samples: 91216 | elapsed time per iteration (ms): 14508.9 | learning rate: 2.526E-05 | global batch size: 32 | lm loss: 6.334565E+00 | loss scale: 32768.0 | grad norm: 183767.589 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4328/ 159576 | consumed samples: 91248 | elapsed time per iteration (ms): 14921.5 | learning rate: 2.527E-05 | global batch size: 32 | lm loss: 6.456017E+00 | loss scale: 32768.0 | grad norm: 239736.759 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4329/ 159576 | consumed samples: 91280 | elapsed time per iteration (ms): 14572.2 | learning rate: 2.528E-05 | global batch size: 32 | lm loss: 6.367092E+00 | loss scale: 32768.0 | grad norm: 195126.741 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4330/ 159576 | consumed samples: 91312 | elapsed time per iteration (ms): 14531.1 | learning rate: 2.529E-05 | global batch size: 32 | lm loss: 6.383262E+00 | loss scale: 32768.0 | grad norm: 208256.244 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4331/ 159576 | consumed samples: 91344 | elapsed time per iteration (ms): 14591.9 | learning rate: 2.530E-05 | global batch size: 32 | lm loss: 6.502596E+00 | loss scale: 32768.0 | grad norm: 248824.057 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4332/ 159576 | consumed samples: 91376 | elapsed time per iteration (ms): 14794.2 | learning rate: 2.531E-05 | global batch size: 32 | lm loss: 6.386366E+00 | loss scale: 32768.0 | grad norm: 223413.013 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4333/ 159576 | consumed samples: 91408 | elapsed time per iteration (ms): 14447.8 | learning rate: 2.532E-05 | global batch size: 32 | lm loss: 6.470964E+00 | loss scale: 32768.0 | grad norm: 220869.102 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4334/ 159576 | consumed samples: 91440 | elapsed time per iteration (ms): 14523.5 | learning rate: 2.533E-05 | global batch size: 32 | lm loss: 6.423388E+00 | loss scale: 32768.0 | grad norm: 204896.062 | num zeros: 0.0 | number 
of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4335/ 159576 | consumed samples: 91472 | elapsed time per iteration (ms): 14548.8 | learning rate: 2.534E-05 | global batch size: 32 | lm loss: 6.516037E+00 | loss scale: 32768.0 | grad norm: 214455.132 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4336/ 159576 | consumed samples: 91504 | elapsed time per iteration (ms): 14925.7 | learning rate: 2.534E-05 | global batch size: 32 | lm loss: 6.420337E+00 | loss scale: 32768.0 | grad norm: 252272.858 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4337/ 159576 | consumed samples: 91536 | elapsed time per iteration (ms): 14576.6 | learning rate: 2.535E-05 | global batch size: 32 | lm loss: 6.464952E+00 | loss scale: 32768.0 | grad norm: 193893.530 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4338/ 159576 | consumed samples: 91568 | elapsed time per iteration (ms): 14502.1 | learning rate: 2.536E-05 | global batch size: 32 | lm loss: 6.492158E+00 | loss scale: 32768.0 | grad norm: 243709.356 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4339/ 159576 | consumed samples: 91600 | elapsed time per iteration (ms): 14503.5 | learning rate: 2.537E-05 | global batch size: 32 | lm loss: 6.239275E+00 | loss scale: 32768.0 | grad norm: 206242.433 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4340/ 159576 | consumed samples: 91632 | elapsed time per iteration (ms): 14881.4 | learning rate: 2.538E-05 | global batch size: 32 | lm loss: 6.484446E+00 | loss scale: 32768.0 | grad norm: 213552.367 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4341/ 159576 | consumed samples: 91664 | elapsed time per iteration (ms): 14651.1 | learning rate: 2.539E-05 | global batch size: 32 | lm loss: 6.419237E+00 | loss scale: 32768.0 | grad norm: 210520.111 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4342/ 159576 | consumed samples: 91696 | elapsed time per iteration (ms): 14512.3 | learning rate: 2.540E-05 | global batch size: 32 | lm loss: 6.452721E+00 | loss scale: 32768.0 | grad norm: 238634.139 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4343/ 159576 | consumed samples: 91728 | elapsed time per iteration (ms): 14558.7 | learning rate: 2.541E-05 | global batch size: 32 | lm loss: 6.347074E+00 | loss scale: 32768.0 | grad norm: 202447.417 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4344/ 159576 | consumed samples: 91760 | elapsed time per iteration (ms): 14594.4 | learning rate: 2.542E-05 | global batch size: 32 | lm loss: 6.520543E+00 | loss scale: 32768.0 | grad norm: 239073.554 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4345/ 159576 | consumed samples: 91792 | elapsed time per iteration (ms): 14908.5 | learning rate: 2.542E-05 | global batch size: 32 | lm loss: 6.421722E+00 | loss scale: 32768.0 | grad norm: 217284.913 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4346/ 159576 | consumed samples: 91824 | elapsed time per iteration (ms): 
14533.0 | learning rate: 2.543E-05 | global batch size: 32 | lm loss: 6.272108E+00 | loss scale: 32768.0 | grad norm: 200271.872 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4347/ 159576 | consumed samples: 91856 | elapsed time per iteration (ms): 14569.7 | learning rate: 2.544E-05 | global batch size: 32 | lm loss: 6.532617E+00 | loss scale: 32768.0 | grad norm: 194761.374 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4348/ 159576 | consumed samples: 91888 | elapsed time per iteration (ms): 14475.9 | learning rate: 2.545E-05 | global batch size: 32 | lm loss: 6.471928E+00 | loss scale: 32768.0 | grad norm: 217213.277 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4349/ 159576 | consumed samples: 91920 | elapsed time per iteration (ms): 14760.6 | learning rate: 2.546E-05 | global batch size: 32 | lm loss: 6.416161E+00 | loss scale: 32768.0 | grad norm: 224313.842 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4350/ 159576 | consumed samples: 91952 | elapsed time per iteration (ms): 14554.3 | learning rate: 2.547E-05 | global batch size: 32 | lm loss: 6.550965E+00 | loss scale: 32768.0 | grad norm: 241887.267 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4351/ 159576 | consumed samples: 91984 | elapsed time per iteration (ms): 14563.9 | learning rate: 2.548E-05 | global batch size: 32 | lm loss: 6.496109E+00 | loss scale: 32768.0 | grad norm: 216683.843 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4352/ 159576 | consumed samples: 92016 | elapsed time per iteration (ms): 14514.3 | learning rate: 2.549E-05 | global batch size: 32 | lm loss: 6.359037E+00 | loss scale: 32768.0 | grad norm: 205500.964 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4353/ 159576 | consumed samples: 92048 | elapsed time per iteration (ms): 14703.1 | learning rate: 2.550E-05 | global batch size: 32 | lm loss: 6.333501E+00 | loss scale: 32768.0 | grad norm: 326501.197 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4354/ 159576 | consumed samples: 92080 | elapsed time per iteration (ms): 14558.2 | learning rate: 2.550E-05 | global batch size: 32 | lm loss: 6.455669E+00 | loss scale: 32768.0 | grad norm: 254904.658 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4355/ 159576 | consumed samples: 92112 | elapsed time per iteration (ms): 14511.5 | learning rate: 2.551E-05 | global batch size: 32 | lm loss: 6.509322E+00 | loss scale: 32768.0 | grad norm: 237041.501 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4356/ 159576 | consumed samples: 92144 | elapsed time per iteration (ms): 14539.0 | learning rate: 2.552E-05 | global batch size: 32 | lm loss: 6.356802E+00 | loss scale: 32768.0 | grad norm: 268871.419 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4357/ 159576 | consumed samples: 92176 | elapsed time per iteration (ms): 14822.4 | learning rate: 2.553E-05 | global batch size: 32 | lm loss: 6.599571E+00 | loss scale: 32768.0 | grad norm: 283473.183 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4358/ 159576 | consumed samples: 92208 | elapsed time per iteration (ms): 14612.7 | learning rate: 2.554E-05 | global batch size: 32 | lm loss: 6.308304E+00 | loss scale: 32768.0 | grad norm: 231784.921 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4359/ 159576 | consumed samples: 92240 | elapsed time per iteration (ms): 14524.9 | learning rate: 2.555E-05 | global batch size: 32 | lm loss: 6.395612E+00 | loss scale: 32768.0 | grad norm: 270045.717 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4360/ 159576 | consumed samples: 92272 | elapsed time per iteration (ms): 14601.7 | learning rate: 2.556E-05 | global batch size: 32 | lm loss: 6.525626E+00 | loss scale: 32768.0 | grad norm: 275256.199 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4361/ 159576 | consumed samples: 92304 | elapsed time per iteration (ms): 14951.2 | learning rate: 2.557E-05 | global batch size: 32 | lm loss: 6.457727E+00 | loss scale: 32768.0 | grad norm: 277346.905 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4362/ 159576 | consumed samples: 92336 | elapsed time per iteration (ms): 14507.2 | learning rate: 2.558E-05 | global batch size: 32 | lm loss: 6.423290E+00 | loss scale: 32768.0 | grad norm: 259149.362 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4363/ 159576 | consumed samples: 92368 | elapsed time per iteration (ms): 14519.9 | learning rate: 2.558E-05 | global batch size: 32 | lm loss: 6.385529E+00 | loss scale: 32768.0 | grad norm: 288729.160 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4364/ 159576 | consumed samples: 92400 | elapsed time per iteration (ms): 14590.0 | learning rate: 2.559E-05 | global batch size: 32 | lm loss: 6.344237E+00 | loss scale: 32768.0 | grad norm: 224867.437 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4365/ 159576 | consumed samples: 92432 | elapsed time per iteration (ms): 15022.1 | learning rate: 2.560E-05 | global batch size: 32 | lm loss: 6.361878E+00 | loss scale: 32768.0 | grad norm: 317761.599 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4366/ 159576 | consumed samples: 92464 | elapsed time per iteration (ms): 14751.4 | learning rate: 2.561E-05 | global batch size: 32 | lm loss: 6.330537E+00 | loss scale: 32768.0 | grad norm: 265015.375 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4367/ 159576 | consumed samples: 92496 | elapsed time per iteration (ms): 14614.0 | learning rate: 2.562E-05 | global batch size: 32 | lm loss: 6.148376E+00 | loss scale: 32768.0 | grad norm: 264202.339 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4368/ 159576 | consumed samples: 92528 | elapsed time per iteration (ms): 14584.5 | learning rate: 2.563E-05 | global batch size: 32 | lm loss: 6.479382E+00 | loss scale: 32768.0 | grad norm: 264375.223 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4369/ 159576 | consumed samples: 92560 | elapsed time per iteration (ms): 14918.5 | learning rate: 2.564E-05 | global batch size: 32 | lm loss: 6.363014E+00 | loss scale: 32768.0 | grad norm: 226102.568 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4370/ 159576 | consumed samples: 92592 | elapsed time per iteration (ms): 14489.4 | learning rate: 2.565E-05 | global batch size: 32 | lm loss: 6.437625E+00 | loss scale: 32768.0 | grad norm: 280139.331 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4371/ 159576 | consumed samples: 92624 | elapsed time per iteration (ms): 14515.3 | learning rate: 2.566E-05 | global batch size: 32 | lm loss: 6.394330E+00 | loss scale: 32768.0 | grad norm: 290041.946 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4372/ 159576 | consumed samples: 92656 | elapsed time per iteration (ms): 14519.6 | learning rate: 2.566E-05 | global batch size: 32 | lm loss: 6.430163E+00 | loss scale: 32768.0 | grad norm: 318528.997 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4373/ 159576 | consumed samples: 92688 | elapsed time per iteration (ms): 14816.9 | learning rate: 2.567E-05 | global batch size: 32 | lm loss: 6.494810E+00 | loss scale: 32768.0 | grad norm: 279939.060 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4374/ 159576 | consumed samples: 92720 | elapsed time per iteration (ms): 14615.4 | learning rate: 2.568E-05 | global batch size: 32 | lm loss: 6.431265E+00 | loss scale: 32768.0 | grad norm: 260943.403 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4375/ 159576 | consumed samples: 92752 | elapsed time per iteration (ms): 14539.2 | learning rate: 2.569E-05 | global batch size: 32 | lm loss: 6.365846E+00 | loss scale: 32768.0 | grad norm: 614516.527 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4376/ 159576 | consumed samples: 92784 | elapsed time per iteration (ms): 14560.9 | learning rate: 2.570E-05 | global batch size: 32 | lm loss: 6.306572E+00 | loss scale: 32768.0 | grad norm: 303539.975 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4377/ 159576 | consumed samples: 92816 | elapsed time per iteration (ms): 14894.6 | learning rate: 2.571E-05 | global batch size: 32 | lm loss: 6.444806E+00 | loss scale: 32768.0 | grad norm: 305405.289 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4378/ 159576 | consumed samples: 92848 | elapsed time per iteration (ms): 14498.0 | learning rate: 2.572E-05 | global batch size: 32 | lm loss: 6.475850E+00 | loss scale: 32768.0 | grad norm: 302245.775 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4379/ 159576 | consumed samples: 92880 | elapsed time per iteration (ms): 14519.5 | learning rate: 2.573E-05 | global batch size: 32 | lm loss: 6.470803E+00 | loss scale: 32768.0 | grad norm: 302163.447 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4380/ 159576 | consumed samples: 92912 | elapsed time per iteration (ms): 14547.1 | learning rate: 2.574E-05 | global batch size: 32 | lm loss: 6.285831E+00 | loss scale: 32768.0 | grad norm: 245533.159 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4381/ 159576 | consumed samples: 92944 | elapsed time per iteration (ms): 14903.6 | learning rate: 2.574E-05 | global batch size: 32 | lm loss: 6.382543E+00 | loss scale: 32768.0 | grad norm: 256847.499 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4382/ 159576 | consumed samples: 92976 | elapsed time per iteration (ms): 14746.3 | learning rate: 2.575E-05 | global batch size: 32 | lm loss: 6.377112E+00 | loss scale: 32768.0 | grad norm: 234822.067 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4383/ 159576 | consumed samples: 93008 | elapsed time per iteration (ms): 14580.0 | learning rate: 2.576E-05 | global batch size: 32 | lm loss: 6.412641E+00 | loss scale: 32768.0 | grad norm: 343040.768 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4384/ 159576 | consumed samples: 93040 | elapsed time per iteration (ms): 14506.7 | learning rate: 2.577E-05 | global batch size: 32 | lm loss: 6.416348E+00 | loss scale: 32768.0 | grad norm: 291818.464 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4385/ 159576 | consumed samples: 93072 | elapsed time per iteration (ms): 14512.2 | learning rate: 2.578E-05 | global batch size: 32 | lm loss: 6.425752E+00 | loss scale: 32768.0 | grad norm: 323662.796 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4386/ 159576 | consumed samples: 93104 | elapsed time per iteration (ms): 14928.6 | learning rate: 2.579E-05 | global batch size: 32 | lm loss: 6.318911E+00 | loss scale: 32768.0 | grad norm: 305616.292 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4387/ 159576 | consumed samples: 93136 | elapsed time per iteration (ms): 14506.3 | learning rate: 2.580E-05 | global batch size: 32 | lm loss: 6.531947E+00 | loss scale: 32768.0 | grad norm: 350201.540 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4388/ 159576 | consumed samples: 93168 | elapsed time per iteration (ms): 14556.8 | learning rate: 2.581E-05 | global batch size: 32 | lm loss: 6.376329E+00 | loss scale: 32768.0 | grad norm: 345044.523 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4389/ 159576 | consumed samples: 93200 | elapsed time per iteration (ms): 14537.0 | learning rate: 2.582E-05 | global batch size: 32 | lm loss: 6.381351E+00 | loss scale: 32768.0 | grad norm: 285108.825 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4390/ 159576 | consumed samples: 93232 | elapsed time per iteration (ms): 14792.9 | learning rate: 2.582E-05 | global batch size: 32 | lm loss: 6.367733E+00 | loss scale: 32768.0 | grad norm: 443607.853 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4391/ 159576 | consumed samples: 93264 | elapsed time per iteration (ms): 14536.7 | learning rate: 2.583E-05 | global batch size: 32 | lm loss: 6.404822E+00 | loss scale: 32768.0 | grad norm: 266018.610 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4392/ 159576 | consumed samples: 93296 | elapsed time per iteration (ms): 14465.3 | learning rate: 2.584E-05 | global batch size: 32 | lm loss: 6.460493E+00 | loss scale: 32768.0 | grad norm: 388305.684 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4393/ 159576 | consumed samples: 93328 | elapsed time per iteration (ms): 14549.7 | learning rate: 2.585E-05 | global batch size: 32 | lm loss: 6.312160E+00 | loss scale: 32768.0 | grad norm: 289444.907 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4394/ 159576 | consumed samples: 93360 | elapsed time per iteration (ms): 14712.4 | learning rate: 2.586E-05 | global batch size: 32 | lm loss: 6.447091E+00 | loss scale: 32768.0 | grad norm: 310866.794 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4395/ 159576 | consumed samples: 93392 | elapsed time per iteration (ms): 14507.9 | learning rate: 2.587E-05 | global batch size: 32 | lm loss: 6.358830E+00 | loss scale: 32768.0 | grad norm: 254147.069 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4396/ 159576 | consumed samples: 93424 | elapsed time per iteration (ms): 14549.6 | learning rate: 2.588E-05 | global batch size: 32 | lm loss: 6.406147E+00 | loss scale: 32768.0 | grad norm: 368220.982 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4397/ 159576 | consumed samples: 93456 | elapsed time per iteration (ms): 14535.1 | learning rate: 2.589E-05 | global batch size: 32 | lm loss: 6.511951E+00 | loss scale: 32768.0 | grad norm: 306021.916 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4398/ 159576 | consumed samples: 93488 | elapsed time per iteration (ms): 14834.9 | learning rate: 2.589E-05 | global batch size: 32 | lm loss: 6.344939E+00 | loss scale: 32768.0 | grad norm: 244440.220 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4399/ 159576 | consumed samples: 93520 | elapsed time per iteration (ms): 14561.9 | learning rate: 2.590E-05 | global batch size: 32 | lm loss: 6.408576E+00 | loss scale: 32768.0 | grad norm: 331789.025 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4400/ 159576 | consumed samples: 93552 | elapsed time per iteration (ms): 14527.0 | learning rate: 2.591E-05 | global batch size: 32 | lm loss: 6.405599E+00 | loss scale: 32768.0 | grad norm: 389927.053 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4401/ 159576 | consumed samples: 93584 | elapsed time per iteration (ms): 14530.9 | learning rate: 2.592E-05 | global batch size: 32 | lm loss: 6.461980E+00 | loss scale: 32768.0 | grad norm: 344518.886 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4402/ 159576 | consumed samples: 93616 | elapsed time per iteration (ms): 15042.1 | learning rate: 2.593E-05 | global batch size: 32 | lm loss: 6.416601E+00 | loss scale: 32768.0 | grad norm: 310590.140 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4403/ 159576 | consumed samples: 93648 | elapsed time per iteration (ms): 14634.8 | learning rate: 2.594E-05 | global batch size: 32 | lm loss: 6.546180E+00 | loss scale: 32768.0 | grad norm: 267385.444 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4404/ 159576 | consumed samples: 93680 | elapsed time per iteration (ms): 14549.2 | learning rate: 2.595E-05 | global batch size: 32 | lm loss: 6.399436E+00 | loss scale: 32768.0 | grad norm: 298662.044 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4405/ 159576 | consumed samples: 93712 | elapsed time per iteration (ms): 14489.5 | learning rate: 2.596E-05 | global batch size: 32 | lm loss: 6.306044E+00 | loss scale: 32768.0 | grad norm: 302499.736 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4406/ 159576 | consumed samples: 93744 | elapsed time per iteration (ms): 14963.1 | learning rate: 2.597E-05 | global batch size: 32 | lm loss: 6.504598E+00 | loss scale: 32768.0 | grad norm: 315577.594 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4407/ 159576 | consumed samples: 93776 | elapsed time per iteration (ms): 14516.0 | learning rate: 2.597E-05 | global batch size: 32 | lm loss: 6.229925E+00 | loss scale: 32768.0 | grad norm: 238182.668 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4408/ 159576 | consumed samples: 93808 | elapsed time per iteration (ms): 14496.6 | learning rate: 2.598E-05 | global batch size: 32 | lm loss: 6.414362E+00 | loss scale: 32768.0 | grad norm: 274509.689 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4409/ 159576 | consumed samples: 93840 | elapsed time per iteration (ms): 14543.5 | learning rate: 2.599E-05 | global batch size: 32 | lm loss: 6.355350E+00 | loss scale: 32768.0 | grad norm: 288329.828 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4410/ 159576 | consumed samples: 93872 | elapsed time per iteration (ms): 14875.5 | learning rate: 2.600E-05 | global batch size: 32 | lm loss: 6.366935E+00 | loss scale: 32768.0 | grad norm: 252983.122 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4411/ 159576 | consumed samples: 93904 | elapsed time per iteration (ms): 14456.2 | learning rate: 2.601E-05 | global batch size: 32 | lm loss: 6.458515E+00 | loss scale: 32768.0 | grad norm: 210575.780 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4412/ 159576 | consumed samples: 93936 | elapsed time per iteration (ms): 14560.7 | learning rate: 2.602E-05 | global batch size: 32 | lm loss: 6.472146E+00 | loss scale: 32768.0 | grad norm: 237114.094 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4413/ 159576 | consumed samples: 93968 | elapsed time per iteration (ms): 14587.5 | learning rate: 2.603E-05 | global batch size: 32 | lm loss: 6.359771E+00 | loss scale: 32768.0 | grad norm: 252911.801 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4414/ 159576 | consumed samples: 94000 | elapsed time per iteration (ms): 14804.6 | learning rate: 2.604E-05 | global batch size: 32 | lm loss: 6.563889E+00 | loss scale: 32768.0 | grad norm: 296794.210 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4415/ 159576 | consumed samples: 94032 | elapsed time per iteration (ms): 14512.9 | learning rate: 2.605E-05 | global batch size: 32 | lm loss: 6.413787E+00 | loss scale: 32768.0 | grad norm: 272034.826 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4416/ 159576 | consumed samples: 94064 | elapsed time per iteration (ms): 14494.5 | learning rate: 2.605E-05 | global batch size: 32 | lm loss: 6.443899E+00 | loss scale: 32768.0 | grad norm: 290284.950 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4417/ 159576 | consumed samples: 94096 | elapsed time per iteration (ms): 14536.8 | learning rate: 2.606E-05 | global batch size: 32 | lm loss: 6.472334E+00 | loss scale: 32768.0 | grad norm: 248961.089 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4418/ 159576 | consumed samples: 94128 | elapsed time per iteration (ms): 14975.6 | learning rate: 2.607E-05 | global batch size: 32 | lm loss: 6.557878E+00 | loss scale: 32768.0 | grad norm: 330814.857 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4419/ 159576 | consumed samples: 94160 | elapsed time per iteration (ms): 14477.8 | learning rate: 2.608E-05 | global batch size: 32 | lm loss: 6.499488E+00 | loss scale: 32768.0 | grad norm: 268804.004 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4420/ 159576 | consumed samples: 94192 | elapsed time per iteration (ms): 14628.8 | learning rate: 2.609E-05 | global batch size: 32 | lm loss: 6.312944E+00 | loss scale: 32768.0 | grad norm: 264253.854 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4421/ 159576 | consumed samples: 94224 | elapsed time per iteration (ms): 14519.9 | learning rate: 2.610E-05 | global batch size: 32 | lm loss: 6.392362E+00 | loss scale: 32768.0 | grad norm: 255470.733 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4422/ 159576 | consumed samples: 94256 | elapsed time per iteration (ms): 14805.5 | learning rate: 2.611E-05 | global batch size: 32 | lm loss: 6.375703E+00 | loss scale: 32768.0 | grad norm: 246267.346 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4423/ 159576 | consumed samples: 94288 | elapsed time per iteration (ms): 14680.3 | learning rate: 2.612E-05 | global batch size: 32 | lm loss: 6.523773E+00 | loss scale: 32768.0 | grad norm: 281090.751 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4424/ 159576 | consumed samples: 94320 | elapsed time per iteration (ms): 7706.4 | learning rate: 2.612E-05 | global batch size: 32 | lm loss: 6.355268E+00 | loss scale: 32768.0 | grad norm: 281090.751 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4425/ 159576 | consumed samples: 94352 | elapsed time per iteration (ms): 13992.5 | learning rate: 2.613E-05 | global batch size: 32 | lm loss: 6.391113E+00 | loss scale: 32768.0 | grad norm: 235806.214 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4426/ 159576 | consumed samples: 94384 | elapsed time per iteration (ms): 14643.4 | learning rate: 2.613E-05 | global batch size: 32 | lm loss: 6.483145E+00 | loss scale: 32768.0 | grad norm: 316001.533 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4427/ 159576 | consumed samples: 94416 | elapsed time per iteration (ms): 14931.0 | learning rate: 2.614E-05 | global batch size: 32 | lm loss: 6.419625E+00 | loss scale: 32768.0 | grad norm: 595148.752 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4428/ 159576 | consumed samples: 94448 | elapsed time per iteration (ms): 14542.3 | learning rate: 2.615E-05 | global batch size: 32 | lm loss: 6.463273E+00 | loss scale: 32768.0 | grad norm: 310708.077 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4429/ 159576 | consumed samples: 94480 | elapsed time per iteration (ms): 14522.5 | learning rate: 2.616E-05 | global batch size: 32 | lm loss: 6.427548E+00 | loss scale: 32768.0 | grad norm: 324018.149 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4430/ 159576 | consumed samples: 94512 | elapsed time per iteration (ms): 14489.9 | learning rate: 2.617E-05 | global batch size: 32 | lm loss: 6.385033E+00 | loss scale: 32768.0 | grad norm: 244981.121 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
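Every record in this log repeats the same pipe-delimited "key: value" layout, so the numeric fields can be extracted mechanically. Below is a minimal Python sketch for doing that, assuming exactly the format shown in the surrounding lines; parse_record is a hypothetical helper written for this note, not part of the Megatron-DeepSpeed training code.

    import re

    def parse_record(line):
        # Match the leading "iteration N/ M" pair, which has no "key:" marker.
        m = re.match(r"\s*iteration\s+(\d+)/\s*(\d+)", line)
        if m is None:
            return None
        fields = {"iteration": int(m.group(1)), "total iterations": int(m.group(2))}
        # Every remaining field looks like "name: number |", where names are
        # lowercase words, spaces, parentheses, or slashes.
        for key, value in re.findall(r"([a-z ()/]+):\s*([\d.E+-]+)", line):
            fields[key.strip()] = float(value)
        return fields

    rec = parse_record(" iteration 4430/ 159576 | consumed samples: 94512 | "
                       "elapsed time per iteration (ms): 14489.9 | learning rate: 2.617E-05 | "
                       "global batch size: 32 | lm loss: 6.385033E+00 | loss scale: 32768.0 | "
                       "grad norm: 244981.121 | num zeros: 0.0 | number of skipped iterations: 0 | "
                       "number of nan iterations: 0 |")
    print(rec["iteration"], rec["lm loss"], rec["grad norm"])  # 4430 6.385033 244981.121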
- iteration 4431/ 159576 | consumed samples: 94560 | elapsed time per iteration (ms): 15763.7 | learning rate: 2.618E-05 | global batch size: 48 | lm loss: 6.545300E+00 | loss scale: 32768.0 | grad norm: 209680.886 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4432/ 159576 | consumed samples: 94608 | elapsed time per iteration (ms): 15487.4 | learning rate: 2.620E-05 | global batch size: 48 | lm loss: 6.439948E+00 | loss scale: 32768.0 | grad norm: 242738.510 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4433/ 159576 | consumed samples: 94656 | elapsed time per iteration (ms): 15516.6 | learning rate: 2.621E-05 | global batch size: 48 | lm loss: 6.392755E+00 | loss scale: 32768.0 | grad norm: 221617.752 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4434/ 159576 | consumed samples: 94704 | elapsed time per iteration (ms): 15531.5 | learning rate: 2.622E-05 | global batch size: 48 | lm loss: 6.430658E+00 | loss scale: 32768.0 | grad norm: 237786.421 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4435/ 159576 | consumed samples: 94752 | elapsed time per iteration (ms): 15905.6 | learning rate: 2.624E-05 | global batch size: 48 | lm loss: 6.556681E+00 | loss scale: 32768.0 | grad norm: 268817.064 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4436/ 159576 | consumed samples: 94800 | elapsed time per iteration (ms): 15557.4 | learning rate: 2.625E-05 | global batch size: 48 | lm loss: 6.284402E+00 | loss scale: 32768.0 | grad norm: 217583.284 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4437/ 159576 | consumed samples: 94848 | elapsed time per iteration (ms): 15418.7 | learning rate: 2.626E-05 | global batch size: 48 | lm loss: 6.449813E+00 | loss scale: 32768.0 | grad norm: 250831.113 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4438/ 159576 | consumed samples: 94896 | elapsed time per iteration (ms): 15465.2 | learning rate: 2.628E-05 | global batch size: 48 | lm loss: 6.524204E+00 | loss scale: 32768.0 | grad norm: 237741.486 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4439/ 159576 | consumed samples: 94944 | elapsed time per iteration (ms): 15664.4 | learning rate: 2.629E-05 | global batch size: 48 | lm loss: 6.426958E+00 | loss scale: 32768.0 | grad norm: 275670.650 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4440/ 159576 | consumed samples: 94992 | elapsed time per iteration (ms): 15485.6 | learning rate: 2.630E-05 | global batch size: 48 | lm loss: 6.312765E+00 | loss scale: 32768.0 | grad norm: 236643.110 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4441/ 159576 | consumed samples: 95040 | elapsed time per iteration (ms): 15554.2 | learning rate: 2.632E-05 | global batch size: 48 | lm loss: 6.353696E+00 | loss scale: 32768.0 | grad norm: 244108.176 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4442/ 159576 | consumed samples: 95088 | elapsed time per iteration (ms): 15559.7 | learning rate: 2.633E-05 | global batch size: 48 | lm loss: 6.390371E+00 | loss scale: 32768.0 | grad norm: 415315.134 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4443/ 159576 | consumed samples: 95136 | elapsed time per iteration (ms): 15762.5 | learning rate: 2.634E-05 | global batch size: 48 | lm loss: 6.406565E+00 | loss scale: 32768.0 | grad norm: 379916.616 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
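At iteration 4431 above, the global batch size ramps from 32 to 48 while the learning rate keeps climbing smoothly through the change: the per-iteration LR step grows by the same 1.5x factor as the batch, which is consistent with a linear warmup driven by consumed samples rather than by iteration count. A sketch under that assumption follows; LR_PER_SAMPLE is a slope estimated from the visible records, not a value taken from the actual training configuration.

    # Hypothetical reconstruction of the warmup schedule suggested by this log.
    LR_PER_SAMPLE = 2.77e-10  # estimated slope (assumption, fitted by eye to the records)

    def warmup_lr(consumed_samples):
        # Linear warmup in consumed samples: a batch-size ramp changes the
        # per-iteration step but leaves the per-sample step untouched.
        return LR_PER_SAMPLE * consumed_samples

    print(f"{warmup_lr(94560):.3E}")  # 2.619E-05 vs. the logged 2.618E-05 (iteration 4431)
    print(f"{warmup_lr(95424):.3E}")  # 2.643E-05 vs. the logged 2.642E-05 (iteration 4449)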
- iteration 4444/ 159576 | consumed samples: 95184 | elapsed time per iteration (ms): 15453.3 | learning rate: 2.636E-05 | global batch size: 48 | lm loss: 6.429417E+00 | loss scale: 32768.0 | grad norm: 221219.524 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4445/ 159576 | consumed samples: 95232 | elapsed time per iteration (ms): 15417.8 | learning rate: 2.637E-05 | global batch size: 48 | lm loss: 6.443903E+00 | loss scale: 32768.0 | grad norm: 296633.656 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4446/ 159576 | consumed samples: 95280 | elapsed time per iteration (ms): 15443.7 | learning rate: 2.638E-05 | global batch size: 48 | lm loss: 6.532698E+00 | loss scale: 32768.0 | grad norm: 269367.053 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4447/ 159576 | consumed samples: 95328 | elapsed time per iteration (ms): 15690.5 | learning rate: 2.640E-05 | global batch size: 48 | lm loss: 6.390007E+00 | loss scale: 32768.0 | grad norm: 235234.160 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4448/ 159576 | consumed samples: 95376 | elapsed time per iteration (ms): 15488.0 | learning rate: 2.641E-05 | global batch size: 48 | lm loss: 6.393896E+00 | loss scale: 32768.0 | grad norm: 210963.912 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4449/ 159576 | consumed samples: 95424 | elapsed time per iteration (ms): 15546.6 | learning rate: 2.642E-05 | global batch size: 48 | lm loss: 6.387472E+00 | loss scale: 32768.0 | grad norm: 214989.320 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4450/ 159576 | consumed samples: 95472 | elapsed time per iteration (ms): 15940.5 | learning rate: 2.644E-05 | global batch size: 48 | lm loss: 6.395288E+00 | loss scale: 32768.0 | grad norm: 214649.184 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4451/ 159576 | consumed samples: 95520 | elapsed time per iteration (ms): 15450.6 | learning rate: 2.645E-05 | global batch size: 48 | lm loss: 6.391924E+00 | loss scale: 32768.0 | grad norm: 256872.340 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4452/ 159576 | consumed samples: 95568 | elapsed time per iteration (ms): 15411.8 | learning rate: 2.646E-05 | global batch size: 48 | lm loss: 6.372116E+00 | loss scale: 32768.0 | grad norm: 227618.006 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4453/ 159576 | consumed samples: 95616 | elapsed time per iteration (ms): 15430.5 | learning rate: 2.648E-05 | global batch size: 48 | lm loss: 6.411846E+00 | loss scale: 32768.0 | grad norm: 239941.344 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4454/ 159576 | consumed samples: 95664 | elapsed time per iteration (ms): 15763.6 | learning rate: 2.649E-05 | global batch size: 48 | lm loss: 6.412562E+00 | loss scale: 32768.0 | grad norm: 229907.704 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4455/ 159576 | consumed samples: 95712 | elapsed time per iteration (ms): 15524.7 | learning rate: 2.650E-05 | global batch size: 48 | lm loss: 6.428136E+00 | loss scale: 32768.0 | grad norm: 223866.778 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4456/ 159576 | consumed samples: 95760 | elapsed time per iteration (ms): 15490.3 | learning rate: 2.652E-05 | global batch size: 48 | lm loss: 6.476852E+00 | loss scale: 32768.0 | grad norm: 263813.676 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4457/ 159576 | consumed samples: 95808 | elapsed time per iteration (ms): 15514.4 | learning rate: 2.653E-05 | global batch size: 48 | lm loss: 6.382901E+00 | loss scale: 32768.0 | grad norm: 257590.659 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4458/ 159576 | consumed samples: 95856 | elapsed time per iteration (ms): 15907.9 | learning rate: 2.654E-05 | global batch size: 48 | lm loss: 6.444118E+00 | loss scale: 32768.0 | grad norm: 236507.018 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4459/ 159576 | consumed samples: 95904 | elapsed time per iteration (ms): 15454.4 | learning rate: 2.656E-05 | global batch size: 48 | lm loss: 6.392717E+00 | loss scale: 32768.0 | grad norm: 227300.988 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4460/ 159576 | consumed samples: 95952 | elapsed time per iteration (ms): 15435.7 | learning rate: 2.657E-05 | global batch size: 48 | lm loss: 6.375526E+00 | loss scale: 32768.0 | grad norm: 217329.765 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4461/ 159576 | consumed samples: 96000 | elapsed time per iteration (ms): 15463.0 | learning rate: 2.658E-05 | global batch size: 48 | lm loss: 6.442908E+00 | loss scale: 32768.0 | grad norm: 210214.078 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4462/ 159576 | consumed samples: 96048 | elapsed time per iteration (ms): 15890.8 | learning rate: 2.660E-05 | global batch size: 48 | lm loss: 6.347652E+00 | loss scale: 32768.0 | grad norm: 241592.870 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4463/ 159576 | consumed samples: 96096 | elapsed time per iteration (ms): 15523.3 | learning rate: 2.661E-05 | global batch size: 48 | lm loss: 6.408596E+00 | loss scale: 32768.0 | grad norm: 286741.620 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4464/ 159576 | consumed samples: 96144 | elapsed time per iteration (ms): 15484.1 | learning rate: 2.662E-05 | global batch size: 48 | lm loss: 6.423483E+00 | loss scale: 32768.0 | grad norm: 227347.115 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4465/ 159576 | consumed samples: 96192 | elapsed time per iteration (ms): 15505.4 | learning rate: 2.664E-05 | global batch size: 48 | lm loss: 6.465323E+00 | loss scale: 32768.0 | grad norm: 278891.247 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4466/ 159576 | consumed samples: 96240 | elapsed time per iteration (ms): 15734.3 | learning rate: 2.665E-05 | global batch size: 48 | lm loss: 6.540909E+00 | loss scale: 32768.0 | grad norm: 271330.289 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4467/ 159576 | consumed samples: 96288 | elapsed time per iteration (ms): 15463.2 | learning rate: 2.666E-05 | global batch size: 48 | lm loss: 6.366038E+00 | loss scale: 32768.0 | grad norm: 230305.551 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4468/ 159576 | consumed samples: 96336 | elapsed time per iteration (ms): 15456.1 | learning rate: 2.668E-05 | global batch size: 48 | lm loss: 6.383101E+00 | loss scale: 32768.0 | grad norm: 266194.555 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4469/ 159576 | consumed samples: 96384 | elapsed time per iteration (ms): 15450.4 | learning rate: 2.669E-05 | global batch size: 48 | lm loss: 6.383107E+00 | loss scale: 32768.0 | grad norm: 224990.535 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4470/ 159576 | consumed samples: 96432 | elapsed time per iteration (ms): 15624.0 | learning rate: 2.670E-05 | global batch size: 48 | lm loss: 6.393697E+00 | loss scale: 32768.0 | grad norm: 301446.071 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4471/ 159576 | consumed samples: 96480 | elapsed time per iteration (ms): 15530.2 | learning rate: 2.672E-05 | global batch size: 48 | lm loss: 6.364079E+00 | loss scale: 32768.0 | grad norm: 215922.999 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4472/ 159576 | consumed samples: 96528 | elapsed time per iteration (ms): 15512.2 | learning rate: 2.673E-05 | global batch size: 48 | lm loss: 6.373242E+00 | loss scale: 32768.0 | grad norm: 297810.241 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4473/ 159576 | consumed samples: 96576 | elapsed time per iteration (ms): 15493.5 | learning rate: 2.674E-05 | global batch size: 48 | lm loss: 6.458824E+00 | loss scale: 32768.0 | grad norm: 253875.814 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4474/ 159576 | consumed samples: 96624 | elapsed time per iteration (ms): 16109.8 | learning rate: 2.676E-05 | global batch size: 48 | lm loss: 6.444027E+00 | loss scale: 32768.0 | grad norm: 235767.912 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4475/ 159576 | consumed samples: 96672 | elapsed time per iteration (ms): 15442.4 | learning rate: 2.677E-05 | global batch size: 48 | lm loss: 6.379702E+00 | loss scale: 32768.0 | grad norm: 200816.895 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4476/ 159576 | consumed samples: 96720 | elapsed time per iteration (ms): 15439.1 | learning rate: 2.678E-05 | global batch size: 48 | lm loss: 6.460698E+00 | loss scale: 32768.0 | grad norm: 243887.532 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4477/ 159576 | consumed samples: 96768 | elapsed time per iteration (ms): 15842.8 | learning rate: 2.680E-05 | global batch size: 48 | lm loss: 6.425824E+00 | loss scale: 32768.0 | grad norm: 194209.566 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4478/ 159576 | consumed samples: 96816 | elapsed time per iteration (ms): 15527.8 | learning rate: 2.681E-05 | global batch size: 48 | lm loss: 6.499928E+00 | loss scale: 32768.0 | grad norm: 205164.907 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4479/ 159576 | consumed samples: 96864 | elapsed time per iteration (ms): 15497.3 | learning rate: 2.682E-05 | global batch size: 48 | lm loss: 6.333491E+00 | loss scale: 32768.0 | grad norm: 198136.402 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4480/ 159576 | consumed samples: 96912 | elapsed time per iteration (ms): 15608.5 | learning rate: 2.684E-05 | global batch size: 48 | lm loss: 6.393649E+00 | loss scale: 32768.0 | grad norm: 226765.459 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4481/ 159576 | consumed samples: 96960 | elapsed time per iteration (ms): 15886.4 | learning rate: 2.685E-05 | global batch size: 48 | lm loss: 6.315465E+00 | loss scale: 32768.0 | grad norm: 233990.065 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4482/ 159576 | consumed samples: 97008 | elapsed time per iteration (ms): 15388.4 | learning rate: 2.686E-05 | global batch size: 48 | lm loss: 6.467194E+00 | loss scale: 32768.0 | grad norm: 253595.606 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4483/ 159576 | consumed samples: 97056 | elapsed time per iteration (ms): 15452.6 | learning rate: 2.688E-05 | global batch size: 48 | lm loss: 6.424766E+00 | loss scale: 32768.0 | grad norm: 243792.882 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4484/ 159576 | consumed samples: 97104 | elapsed time per iteration (ms): 15440.8 | learning rate: 2.689E-05 | global batch size: 48 | lm loss: 6.382202E+00 | loss scale: 32768.0 | grad norm: 253619.641 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4485/ 159576 | consumed samples: 97152 | elapsed time per iteration (ms): 15758.4 | learning rate: 2.690E-05 | global batch size: 48 | lm loss: 6.420368E+00 | loss scale: 32768.0 | grad norm: 270122.233 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4486/ 159576 | consumed samples: 97200 | elapsed time per iteration (ms): 15504.2 | learning rate: 2.692E-05 | global batch size: 48 | lm loss: 6.341059E+00 | loss scale: 32768.0 | grad norm: 264076.223 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4487/ 159576 | consumed samples: 97248 | elapsed time per iteration (ms): 15564.4 | learning rate: 2.693E-05 | global batch size: 48 | lm loss: 6.351835E+00 | loss scale: 32768.0 | grad norm: 254803.371 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4488/ 159576 | consumed samples: 97296 | elapsed time per iteration (ms): 15603.6 | learning rate: 2.694E-05 | global batch size: 48 | lm loss: 6.344017E+00 | loss scale: 32768.0 | grad norm: 244790.218 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4489/ 159576 | consumed samples: 97344 | elapsed time per iteration (ms): 15804.2 | learning rate: 2.696E-05 | global batch size: 48 | lm loss: 6.487484E+00 | loss scale: 32768.0 | grad norm: 242539.962 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4490/ 159576 | consumed samples: 97392 | elapsed time per iteration (ms): 15547.3 | learning rate: 2.697E-05 | global batch size: 48 | lm loss: 6.339984E+00 | loss scale: 32768.0 | grad norm: 225575.703 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4491/ 159576 | consumed samples: 97440 | elapsed time per iteration (ms): 15475.7 | learning rate: 2.698E-05 | global batch size: 48 | lm loss: 6.449341E+00 | loss scale: 32768.0 | grad norm: 205395.664 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4492/ 159576 | consumed samples: 97488 | elapsed time per iteration (ms): 15436.0 | learning rate: 2.700E-05 | global batch size: 48 | lm loss: 6.382250E+00 | loss scale: 32768.0 | grad norm: 234078.700 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4493/ 159576 | consumed samples: 97536 | elapsed time per iteration (ms): 15764.8 | learning rate: 2.701E-05 | global batch size: 48 | lm loss: 6.425200E+00 | loss scale: 32768.0 | grad norm: 247476.491 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4494/ 159576 | consumed samples: 97584 | elapsed time per iteration (ms): 15532.5 | learning rate: 2.702E-05 | global batch size: 48 | lm loss: 6.381852E+00 | loss scale: 32768.0 | grad norm: 242648.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4495/ 159576 | consumed samples: 97632 | elapsed time per iteration (ms): 15533.1 | learning rate: 2.704E-05 | global batch size: 48 | lm loss: 6.230868E+00 | loss scale: 32768.0 | grad norm: 219731.182 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4496/ 159576 | consumed samples: 97680 | elapsed time per iteration (ms): 15535.3 | learning rate: 2.705E-05 | global batch size: 48 | lm loss: 6.353293E+00 | loss scale: 32768.0 | grad norm: 216013.670 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4497/ 159576 | consumed samples: 97728 | elapsed time per iteration (ms): 15701.4 | learning rate: 2.706E-05 | global batch size: 48 | lm loss: 6.307485E+00 | loss scale: 32768.0 | grad norm: 253204.960 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4498/ 159576 | consumed samples: 97776 | elapsed time per iteration (ms): 15494.7 | learning rate: 2.708E-05 | global batch size: 48 | lm loss: 6.438371E+00 | loss scale: 32768.0 | grad norm: 214787.319 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4499/ 159576 | consumed samples: 97824 | elapsed time per iteration (ms): 15439.3 | learning rate: 2.709E-05 | global batch size: 48 | lm loss: 6.311467E+00 | loss scale: 32768.0 | grad norm: 216077.744 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4500/ 159576 | consumed samples: 97872 | elapsed time per iteration (ms): 15685.9 | learning rate: 2.710E-05 | global batch size: 48 | lm loss: 6.423208E+00 | loss scale: 32768.0 | grad norm: 207994.297 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-saving checkpoint at iteration 4500 to /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints
-[2021-09-24 19:52:27,219] [INFO] [logging.py:68:log_dist] [Rank 0] Saving model checkpoint: /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/global_step4500/mp_rank_00_model_states.pt
- successfully saved checkpoint at iteration 4500 to /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints
-time (ms) | save-checkpoint: 17750.13
- iteration 4501/ 159576 | consumed samples: 97920 | elapsed time per iteration (ms): 33282.4 | learning rate: 2.712E-05 | global batch size: 48 | lm loss: 6.402827E+00 | loss scale: 32768.0 | grad norm: 259915.027 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
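The checkpoint save above also explains the one-off slow step that follows it: iteration 4501 reports 33282.4 ms instead of the usual ~15500 ms, and subtracting the logged save-checkpoint time of 17750.13 ms puts it back in line with its neighbours. A small sanity-check sketch; samples_per_second is a hypothetical helper that simply divides global batch size by elapsed time.

    def samples_per_second(global_batch_size, elapsed_ms):
        # Throughput implied by a single log record.
        return global_batch_size / (elapsed_ms / 1000.0)

    print(samples_per_second(48, 15685.9))  # iteration 4500: ~3.06 samples/s
    # Iteration 4501 folds in the checkpoint save:
    # 33282.4 ms - 17750.13 ms = 15532.27 ms, a normal step once the save is removed.
    print(samples_per_second(48, 33282.4))  # ~1.44 samples/s for the checkpoint step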
- iteration 4502/ 159576 | consumed samples: 97968 | elapsed time per iteration (ms): 15581.1 | learning rate: 2.713E-05 | global batch size: 48 | lm loss: 6.310410E+00 | loss scale: 32768.0 | grad norm: 222384.313 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4503/ 159576 | consumed samples: 98016 | elapsed time per iteration (ms): 15856.7 | learning rate: 2.714E-05 | global batch size: 48 | lm loss: 6.259107E+00 | loss scale: 32768.0 | grad norm: 219981.429 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4504/ 159576 | consumed samples: 98064 | elapsed time per iteration (ms): 15522.8 | learning rate: 2.716E-05 | global batch size: 48 | lm loss: 6.441791E+00 | loss scale: 32768.0 | grad norm: 235487.992 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4505/ 159576 | consumed samples: 98112 | elapsed time per iteration (ms): 15475.3 | learning rate: 2.717E-05 | global batch size: 48 | lm loss: 6.431644E+00 | loss scale: 32768.0 | grad norm: 308152.550 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4506/ 159576 | consumed samples: 98160 | elapsed time per iteration (ms): 15475.2 | learning rate: 2.718E-05 | global batch size: 48 | lm loss: 6.437158E+00 | loss scale: 32768.0 | grad norm: 223087.933 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4507/ 159576 | consumed samples: 98208 | elapsed time per iteration (ms): 15919.3 | learning rate: 2.720E-05 | global batch size: 48 | lm loss: 6.456445E+00 | loss scale: 32768.0 | grad norm: 223422.565 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4508/ 159576 | consumed samples: 98256 | elapsed time per iteration (ms): 15503.1 | learning rate: 2.721E-05 | global batch size: 48 | lm loss: 6.409997E+00 | loss scale: 32768.0 | grad norm: 245785.630 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4509/ 159576 | consumed samples: 98304 | elapsed time per iteration (ms): 15512.1 | learning rate: 2.722E-05 | global batch size: 48 | lm loss: 6.441339E+00 | loss scale: 32768.0 | grad norm: 283619.839 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4510/ 159576 | consumed samples: 98352 | elapsed time per iteration (ms): 15548.0 | learning rate: 2.724E-05 | global batch size: 48 | lm loss: 6.441983E+00 | loss scale: 32768.0 | grad norm: 235037.042 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4511/ 159576 | consumed samples: 98400 | elapsed time per iteration (ms): 15735.6 | learning rate: 2.725E-05 | global batch size: 48 | lm loss: 6.499406E+00 | loss scale: 32768.0 | grad norm: 238925.774 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4512/ 159576 | consumed samples: 98448 | elapsed time per iteration (ms): 15495.6 | learning rate: 2.726E-05 | global batch size: 48 | lm loss: 6.429494E+00 | loss scale: 32768.0 | grad norm: 295604.429 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4513/ 159576 | consumed samples: 98496 | elapsed time per iteration (ms): 15481.9 | learning rate: 2.728E-05 | global batch size: 48 | lm loss: 6.407839E+00 | loss scale: 32768.0 | grad norm: 292842.531 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4514/ 159576 | consumed samples: 98544 | elapsed time per iteration (ms): 15479.3 | learning rate: 2.729E-05 | global batch size: 48 | lm loss: 6.440022E+00 | loss scale: 32768.0 | grad norm: 270315.805 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4515/ 159576 | consumed samples: 98592 | elapsed time per iteration (ms): 15606.8 | learning rate: 2.730E-05 | global batch size: 48 | lm loss: 6.391658E+00 | loss scale: 32768.0 | grad norm: 271519.155 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4516/ 159576 | consumed samples: 98640 | elapsed time per iteration (ms): 15492.8 | learning rate: 2.732E-05 | global batch size: 48 | lm loss: 6.445361E+00 | loss scale: 32768.0 | grad norm: 235853.751 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4517/ 159576 | consumed samples: 98688 | elapsed time per iteration (ms): 15525.5 | learning rate: 2.733E-05 | global batch size: 48 | lm loss: 6.274318E+00 | loss scale: 32768.0 | grad norm: 246250.889 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4518/ 159576 | consumed samples: 98736 | elapsed time per iteration (ms): 15595.2 | learning rate: 2.734E-05 | global batch size: 48 | lm loss: 6.378585E+00 | loss scale: 32768.0 | grad norm: 262163.945 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4519/ 159576 | consumed samples: 98784 | elapsed time per iteration (ms): 15657.4 | learning rate: 2.736E-05 | global batch size: 48 | lm loss: 6.398365E+00 | loss scale: 32768.0 | grad norm: 339087.705 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4520/ 159576 | consumed samples: 98832 | elapsed time per iteration (ms): 15503.5 | learning rate: 2.737E-05 | global batch size: 48 | lm loss: 6.435692E+00 | loss scale: 32768.0 | grad norm: 219944.197 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4521/ 159576 | consumed samples: 98880 | elapsed time per iteration (ms): 15444.3 | learning rate: 2.738E-05 | global batch size: 48 | lm loss: 6.418158E+00 | loss scale: 32768.0 | grad norm: 295809.324 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4522/ 159576 | consumed samples: 98928 | elapsed time per iteration (ms): 15726.5 | learning rate: 2.739E-05 | global batch size: 48 | lm loss: 6.317287E+00 | loss scale: 32768.0 | grad norm: 256139.821 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4523/ 159576 | consumed samples: 98976 | elapsed time per iteration (ms): 15697.5 | learning rate: 2.741E-05 | global batch size: 48 | lm loss: 6.210083E+00 | loss scale: 32768.0 | grad norm: 222390.085 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4524/ 159576 | consumed samples: 99024 | elapsed time per iteration (ms): 15483.9 | learning rate: 2.742E-05 | global batch size: 48 | lm loss: 6.357608E+00 | loss scale: 32768.0 | grad norm: 250631.340 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4525/ 159576 | consumed samples: 99072 | elapsed time per iteration (ms): 15498.9 | learning rate: 2.743E-05 | global batch size: 48 | lm loss: 6.439158E+00 | loss scale: 32768.0 | grad norm: 237183.590 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4526/ 159576 | consumed samples: 99120 | elapsed time per iteration (ms): 15870.3 | learning rate: 2.745E-05 | global batch size: 48 | lm loss: 6.477302E+00 | loss scale: 32768.0 | grad norm: 234590.425 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4527/ 159576 | consumed samples: 99168 | elapsed time per iteration (ms): 15527.5 | learning rate: 2.746E-05 | global batch size: 48 | lm loss: 6.404512E+00 | loss scale: 32768.0 | grad norm: 268737.102 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4528/ 159576 | consumed samples: 99216 | elapsed time per iteration (ms): 15477.7 | learning rate: 2.747E-05 | global batch size: 48 | lm loss: 6.357052E+00 | loss scale: 32768.0 | grad norm: 199055.934 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4529/ 159576 | consumed samples: 99264 | elapsed time per iteration (ms): 15441.0 | learning rate: 2.749E-05 | global batch size: 48 | lm loss: 6.418729E+00 | loss scale: 32768.0 | grad norm: 280337.259 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4530/ 159576 | consumed samples: 99312 | elapsed time per iteration (ms): 15870.6 | learning rate: 2.750E-05 | global batch size: 48 | lm loss: 6.394526E+00 | loss scale: 32768.0 | grad norm: 242159.812 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4531/ 159576 | consumed samples: 99360 | elapsed time per iteration (ms): 15356.1 | learning rate: 2.751E-05 | global batch size: 48 | lm loss: 6.454551E+00 | loss scale: 32768.0 | grad norm: 238356.429 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4532/ 159576 | consumed samples: 99408 | elapsed time per iteration (ms): 15481.2 | learning rate: 2.753E-05 | global batch size: 48 | lm loss: 6.479828E+00 | loss scale: 32768.0 | grad norm: 256781.681 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4533/ 159576 | consumed samples: 99456 | elapsed time per iteration (ms): 15512.7 | learning rate: 2.754E-05 | global batch size: 48 | lm loss: 6.347847E+00 | loss scale: 32768.0 | grad norm: 232593.280 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4534/ 159576 | consumed samples: 99504 | elapsed time per iteration (ms): 16020.6 | learning rate: 2.755E-05 | global batch size: 48 | lm loss: 6.361287E+00 | loss scale: 32768.0 | grad norm: 214859.706 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4535/ 159576 | consumed samples: 99552 | elapsed time per iteration (ms): 15687.2 | learning rate: 2.757E-05 | global batch size: 48 | lm loss: 6.344873E+00 | loss scale: 32768.0 | grad norm: 214653.297 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4536/ 159576 | consumed samples: 99600 | elapsed time per iteration (ms): 15424.3 | learning rate: 2.758E-05 | global batch size: 48 | lm loss: 6.273855E+00 | loss scale: 32768.0 | grad norm: 249309.228 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4537/ 159576 | consumed samples: 99648 | elapsed time per iteration (ms): 15440.3 | learning rate: 2.759E-05 | global batch size: 48 | lm loss: 6.373835E+00 | loss scale: 32768.0 | grad norm: 230963.275 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4538/ 159576 | consumed samples: 99696 | elapsed time per iteration (ms): 15788.5 | learning rate: 2.761E-05 | global batch size: 48 | lm loss: 6.381639E+00 | loss scale: 32768.0 | grad norm: 258586.304 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4539/ 159576 | consumed samples: 99744 | elapsed time per iteration (ms): 15436.7 | learning rate: 2.762E-05 | global batch size: 48 | lm loss: 6.464207E+00 | loss scale: 32768.0 | grad norm: 260715.522 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4540/ 159576 | consumed samples: 99792 | elapsed time per iteration (ms): 15631.9 | learning rate: 2.763E-05 | global batch size: 48 | lm loss: 6.282461E+00 | loss scale: 32768.0 | grad norm: 271394.559 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4541/ 159576 | consumed samples: 99840 | elapsed time per iteration (ms): 15417.1 | learning rate: 2.765E-05 | global batch size: 48 | lm loss: 6.323977E+00 | loss scale: 32768.0 | grad norm: 268740.684 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4542/ 159576 | consumed samples: 99888 | elapsed time per iteration (ms): 15726.7 | learning rate: 2.766E-05 | global batch size: 48 | lm loss: 6.419955E+00 | loss scale: 32768.0 | grad norm: 270171.155 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4543/ 159576 | consumed samples: 99936 | elapsed time per iteration (ms): 15524.6 | learning rate: 2.767E-05 | global batch size: 48 | lm loss: 6.456992E+00 | loss scale: 32768.0 | grad norm: 255182.014 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4544/ 159576 | consumed samples: 99984 | elapsed time per iteration (ms): 15442.0 | learning rate: 2.769E-05 | global batch size: 48 | lm loss: 6.327838E+00 | loss scale: 32768.0 | grad norm: 224129.919 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4545/ 159576 | consumed samples: 100032 | elapsed time per iteration (ms): 15419.1 | learning rate: 2.770E-05 | global batch size: 48 | lm loss: 6.374109E+00 | loss scale: 32768.0 | grad norm: 265872.290 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4546/ 159576 | consumed samples: 100080 | elapsed time per iteration (ms): 15626.3 | learning rate: 2.771E-05 | global batch size: 48 | lm loss: 6.332025E+00 | loss scale: 32768.0 | grad norm: 221965.501 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4547/ 159576 | consumed samples: 100128 | elapsed time per iteration (ms): 15454.8 | learning rate: 2.773E-05 | global batch size: 48 | lm loss: 6.399364E+00 | loss scale: 32768.0 | grad norm: 257839.194 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4548/ 159576 | consumed samples: 100176 | elapsed time per iteration (ms): 15431.4 | learning rate: 2.774E-05 | global batch size: 48 | lm loss: 6.411947E+00 | loss scale: 32768.0 | grad norm: 278135.374 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4549/ 159576 | consumed samples: 100224 | elapsed time per iteration (ms): 15844.6 | learning rate: 2.775E-05 | global batch size: 48 | lm loss: 6.477700E+00 | loss scale: 32768.0 | grad norm: 277855.734 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4550/ 159576 | consumed samples: 100272 | elapsed time per iteration (ms): 15537.3 | learning rate: 2.777E-05 | global batch size: 48 | lm loss: 6.526390E+00 | loss scale: 32768.0 | grad norm: 246063.278 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4551/ 159576 | consumed samples: 100320 | elapsed time per iteration (ms): 15431.5 | learning rate: 2.778E-05 | global batch size: 48 | lm loss: 6.391055E+00 | loss scale: 32768.0 | grad norm: 230174.666 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4552/ 159576 | consumed samples: 100368 | elapsed time per iteration (ms): 15392.1 | learning rate: 2.779E-05 | global batch size: 48 | lm loss: 6.381279E+00 | loss scale: 32768.0 | grad norm: 230427.324 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4553/ 159576 | consumed samples: 100416 | elapsed time per iteration (ms): 15770.9 | learning rate: 2.781E-05 | global batch size: 48 | lm loss: 6.438869E+00 | loss scale: 32768.0 | grad norm: 230488.615 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4554/ 159576 | consumed samples: 100464 | elapsed time per iteration (ms): 15447.7 | learning rate: 2.782E-05 | global batch size: 48 | lm loss: 6.245214E+00 | loss scale: 32768.0 | grad norm: 277295.687 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4555/ 159576 | consumed samples: 100512 | elapsed time per iteration (ms): 15446.9 | learning rate: 2.783E-05 | global batch size: 48 | lm loss: 6.413427E+00 | loss scale: 32768.0 | grad norm: 223183.539 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-[2021-09-24 20:07:07] PULSE: tr8-104B is waiting for the previous job to finish before scheduling a new one using the dependency mechanism (1165978_[1-10%1] on 'gpu_p13' partition)
-[2021-09-24 20:07:07] PULSE: tr8-104B is running for 14:14:56 since 2021-09-24T05:52:11 (1162855_1 on 'gpu_p13' partition (r6i4n[5,7],r6i5n[2,7-8],r6i6n[0,2,6],r7i2n[4-5],r7i6n[2-4],r7i7n[7-8],r8i0n[2-3,5-8],r8i1n[0,2-4],r8i2n8,r8i3n[0-2],r8i5n[3-4],r8i7n[3-8],r9i0n[0-2],r9i1n[0-3],r9i2n[3-5,8],r9i3n[0-1,7-8],r9i4n[0-2],r9i5n[3-8],r9i6n[0,7-8])
- iteration 4556/ 159576 | consumed samples: 100560 | elapsed time per iteration (ms): 15400.2 | learning rate: 2.785E-05 | global batch size: 48 | lm loss: 6.398170E+00 | loss scale: 32768.0 | grad norm: 233778.721 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4557/ 159576 | consumed samples: 100608 | elapsed time per iteration (ms): 15788.3 | learning rate: 2.786E-05 | global batch size: 48 | lm loss: 6.417650E+00 | loss scale: 32768.0 | grad norm: 311870.109 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4558/ 159576 | consumed samples: 100656 | elapsed time per iteration (ms): 15428.6 | learning rate: 2.787E-05 | global batch size: 48 | lm loss: 6.394480E+00 | loss scale: 32768.0 | grad norm: 234331.495 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4559/ 159576 | consumed samples: 100704 | elapsed time per iteration (ms): 15432.2 | learning rate: 2.789E-05 | global batch size: 48 | lm loss: 6.379920E+00 | loss scale: 32768.0 | grad norm: 256774.134 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4560/ 159576 | consumed samples: 100752 | elapsed time per iteration (ms): 15427.3 | learning rate: 2.790E-05 | global batch size: 48 | lm loss: 6.398593E+00 | loss scale: 32768.0 | grad norm: 244274.326 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 4561/ 159576 | consumed samples: 100800 | elapsed time per iteration (ms): 15906.6 | learning rate: 2.791E-05 | global batch size: 48 | lm loss: 6.370606E+00
| loss scale: 32768.0 | grad norm: 239881.224 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4562/ 159576 | consumed samples: 100848 | elapsed time per iteration (ms): 15436.7 | learning rate: 2.793E-05 | global batch size: 48 | lm loss: 6.449897E+00 | loss scale: 32768.0 | grad norm: 244189.290 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4563/ 159576 | consumed samples: 100896 | elapsed time per iteration (ms): 15423.9 | learning rate: 2.794E-05 | global batch size: 48 | lm loss: 6.361297E+00 | loss scale: 32768.0 | grad norm: 214769.520 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4564/ 159576 | consumed samples: 100944 | elapsed time per iteration (ms): 15485.4 | learning rate: 2.795E-05 | global batch size: 48 | lm loss: 6.315623E+00 | loss scale: 32768.0 | grad norm: 238075.723 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4565/ 159576 | consumed samples: 100992 | elapsed time per iteration (ms): 15712.7 | learning rate: 2.797E-05 | global batch size: 48 | lm loss: 6.407779E+00 | loss scale: 32768.0 | grad norm: 219946.422 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4566/ 159576 | consumed samples: 101040 | elapsed time per iteration (ms): 15450.4 | learning rate: 2.798E-05 | global batch size: 48 | lm loss: 6.417436E+00 | loss scale: 32768.0 | grad norm: 240930.366 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4567/ 159576 | consumed samples: 101088 | elapsed time per iteration (ms): 15429.7 | learning rate: 2.799E-05 | global batch size: 48 | lm loss: 6.436010E+00 | loss scale: 32768.0 | grad norm: 314077.087 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4568/ 159576 | consumed samples: 101136 | elapsed time per iteration (ms): 15422.9 | learning rate: 2.801E-05 | global batch size: 48 | lm loss: 6.520737E+00 | loss scale: 32768.0 | grad norm: 274297.002 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4569/ 159576 | consumed samples: 101184 | elapsed time per iteration (ms): 15586.4 | learning rate: 2.802E-05 | global batch size: 48 | lm loss: 6.416994E+00 | loss scale: 32768.0 | grad norm: 231703.132 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4570/ 159576 | consumed samples: 101232 | elapsed time per iteration (ms): 15422.0 | learning rate: 2.803E-05 | global batch size: 48 | lm loss: 6.319811E+00 | loss scale: 32768.0 | grad norm: 231530.726 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4571/ 159576 | consumed samples: 101280 | elapsed time per iteration (ms): 15338.3 | learning rate: 2.805E-05 | global batch size: 48 | lm loss: 6.400026E+00 | loss scale: 32768.0 | grad norm: 257733.850 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4572/ 159576 | consumed samples: 101328 | elapsed time per iteration (ms): 15446.6 | learning rate: 2.806E-05 | global batch size: 48 | lm loss: 6.435762E+00 | loss scale: 32768.0 | grad norm: 268511.480 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - 
iteration 4573/ 159576 | consumed samples: 101376 | elapsed time per iteration (ms): 15589.8 | learning rate: 2.807E-05 | global batch size: 48 | lm loss: 6.406414E+00 | loss scale: 32768.0 | grad norm: 233768.669 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4574/ 159576 | consumed samples: 101424 | elapsed time per iteration (ms): 15349.3 | learning rate: 2.809E-05 | global batch size: 48 | lm loss: 6.437346E+00 | loss scale: 32768.0 | grad norm: 269214.009 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4575/ 159576 | consumed samples: 101472 | elapsed time per iteration (ms): 15388.4 | learning rate: 2.810E-05 | global batch size: 48 | lm loss: 6.352981E+00 | loss scale: 32768.0 | grad norm: 243418.743 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4576/ 159576 | consumed samples: 101520 | elapsed time per iteration (ms): 15469.0 | learning rate: 2.811E-05 | global batch size: 48 | lm loss: 6.355519E+00 | loss scale: 32768.0 | grad norm: 255521.793 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4577/ 159576 | consumed samples: 101568 | elapsed time per iteration (ms): 15986.1 | learning rate: 2.813E-05 | global batch size: 48 | lm loss: 6.380365E+00 | loss scale: 32768.0 | grad norm: 263123.213 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4578/ 159576 | consumed samples: 101616 | elapsed time per iteration (ms): 15483.5 | learning rate: 2.814E-05 | global batch size: 48 | lm loss: 6.442792E+00 | loss scale: 32768.0 | grad norm: 264664.009 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4579/ 159576 | consumed samples: 101664 | elapsed time per iteration (ms): 15482.0 | learning rate: 2.815E-05 | global batch size: 48 | lm loss: 6.300795E+00 | loss scale: 32768.0 | grad norm: 263093.923 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4580/ 159576 | consumed samples: 101712 | elapsed time per iteration (ms): 15915.5 | learning rate: 2.817E-05 | global batch size: 48 | lm loss: 6.509340E+00 | loss scale: 32768.0 | grad norm: 325066.014 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4581/ 159576 | consumed samples: 101760 | elapsed time per iteration (ms): 15478.8 | learning rate: 2.818E-05 | global batch size: 48 | lm loss: 6.417569E+00 | loss scale: 32768.0 | grad norm: 317932.491 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4582/ 159576 | consumed samples: 101808 | elapsed time per iteration (ms): 15467.6 | learning rate: 2.819E-05 | global batch size: 48 | lm loss: 6.391977E+00 | loss scale: 32768.0 | grad norm: 265433.359 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4583/ 159576 | consumed samples: 101856 | elapsed time per iteration (ms): 15463.2 | learning rate: 2.821E-05 | global batch size: 48 | lm loss: 6.493138E+00 | loss scale: 32768.0 | grad norm: 262301.719 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4584/ 159576 | consumed samples: 101904 | elapsed time per iteration (ms): 15787.5 | learning rate: 2.822E-05 | global batch size: 48 
| lm loss: 6.358137E+00 | loss scale: 32768.0 | grad norm: 302003.298 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4585/ 159576 | consumed samples: 101952 | elapsed time per iteration (ms): 15486.8 | learning rate: 2.823E-05 | global batch size: 48 | lm loss: 6.398649E+00 | loss scale: 32768.0 | grad norm: 241427.078 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4586/ 159576 | consumed samples: 102000 | elapsed time per iteration (ms): 15502.1 | learning rate: 2.825E-05 | global batch size: 48 | lm loss: 6.450002E+00 | loss scale: 32768.0 | grad norm: 288231.307 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4587/ 159576 | consumed samples: 102048 | elapsed time per iteration (ms): 15613.4 | learning rate: 2.826E-05 | global batch size: 48 | lm loss: 6.463566E+00 | loss scale: 32768.0 | grad norm: 255700.156 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4588/ 159576 | consumed samples: 102096 | elapsed time per iteration (ms): 16100.7 | learning rate: 2.827E-05 | global batch size: 48 | lm loss: 6.440113E+00 | loss scale: 32768.0 | grad norm: 228589.163 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4589/ 159576 | consumed samples: 102144 | elapsed time per iteration (ms): 15550.6 | learning rate: 2.829E-05 | global batch size: 48 | lm loss: 6.330764E+00 | loss scale: 32768.0 | grad norm: 253562.437 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4590/ 159576 | consumed samples: 102192 | elapsed time per iteration (ms): 15504.0 | learning rate: 2.830E-05 | global batch size: 48 | lm loss: 6.565317E+00 | loss scale: 32768.0 | grad norm: 248109.457 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4591/ 159576 | consumed samples: 102240 | elapsed time per iteration (ms): 15500.8 | learning rate: 2.831E-05 | global batch size: 48 | lm loss: 6.432470E+00 | loss scale: 32768.0 | grad norm: 258408.480 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4592/ 159576 | consumed samples: 102288 | elapsed time per iteration (ms): 15682.0 | learning rate: 2.833E-05 | global batch size: 48 | lm loss: 6.388723E+00 | loss scale: 32768.0 | grad norm: 255460.696 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4593/ 159576 | consumed samples: 102336 | elapsed time per iteration (ms): 15624.8 | learning rate: 2.834E-05 | global batch size: 48 | lm loss: 6.252523E+00 | loss scale: 32768.0 | grad norm: 247063.847 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4594/ 159576 | consumed samples: 102384 | elapsed time per iteration (ms): 15619.9 | learning rate: 2.835E-05 | global batch size: 48 | lm loss: 6.256584E+00 | loss scale: 32768.0 | grad norm: 252094.746 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4595/ 159576 | consumed samples: 102432 | elapsed time per iteration (ms): 15618.3 | learning rate: 2.837E-05 | global batch size: 48 | lm loss: 6.422144E+00 | loss scale: 32768.0 | grad norm: 327415.393 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan 
iterations: 0 | -time (ms) - iteration 4596/ 159576 | consumed samples: 102480 | elapsed time per iteration (ms): 15731.1 | learning rate: 2.838E-05 | global batch size: 48 | lm loss: 6.362859E+00 | loss scale: 32768.0 | grad norm: 271628.783 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4597/ 159576 | consumed samples: 102528 | elapsed time per iteration (ms): 15470.5 | learning rate: 2.839E-05 | global batch size: 48 | lm loss: 6.400634E+00 | loss scale: 32768.0 | grad norm: 270235.866 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4598/ 159576 | consumed samples: 102576 | elapsed time per iteration (ms): 15494.8 | learning rate: 2.841E-05 | global batch size: 48 | lm loss: 6.409593E+00 | loss scale: 32768.0 | grad norm: 246051.964 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4599/ 159576 | consumed samples: 102624 | elapsed time per iteration (ms): 15503.4 | learning rate: 2.842E-05 | global batch size: 48 | lm loss: 6.286301E+00 | loss scale: 32768.0 | grad norm: 315951.056 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4600/ 159576 | consumed samples: 102672 | elapsed time per iteration (ms): 15657.8 | learning rate: 2.843E-05 | global batch size: 48 | lm loss: 6.424391E+00 | loss scale: 32768.0 | grad norm: 257970.239 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4601/ 159576 | consumed samples: 102720 | elapsed time per iteration (ms): 15415.9 | learning rate: 2.845E-05 | global batch size: 48 | lm loss: 6.419086E+00 | loss scale: 32768.0 | grad norm: 232614.820 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4602/ 159576 | consumed samples: 102768 | elapsed time per iteration (ms): 15506.4 | learning rate: 2.846E-05 | global batch size: 48 | lm loss: 6.598701E+00 | loss scale: 32768.0 | grad norm: 269465.797 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4603/ 159576 | consumed samples: 102816 | elapsed time per iteration (ms): 15842.0 | learning rate: 2.847E-05 | global batch size: 48 | lm loss: 6.374152E+00 | loss scale: 32768.0 | grad norm: 256871.390 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4604/ 159576 | consumed samples: 102864 | elapsed time per iteration (ms): 15661.0 | learning rate: 2.849E-05 | global batch size: 48 | lm loss: 6.330672E+00 | loss scale: 32768.0 | grad norm: 261276.305 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4605/ 159576 | consumed samples: 102912 | elapsed time per iteration (ms): 15453.1 | learning rate: 2.850E-05 | global batch size: 48 | lm loss: 6.409989E+00 | loss scale: 32768.0 | grad norm: 213427.896 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4606/ 159576 | consumed samples: 102960 | elapsed time per iteration (ms): 15529.1 | learning rate: 2.851E-05 | global batch size: 48 | lm loss: 6.409967E+00 | loss scale: 32768.0 | grad norm: 343079.843 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4607/ 159576 | consumed samples: 103008 | elapsed time per iteration (ms): 15784.9 | learning rate: 
2.853E-05 | global batch size: 48 | lm loss: 6.345381E+00 | loss scale: 32768.0 | grad norm: 288014.524 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4608/ 159576 | consumed samples: 103056 | elapsed time per iteration (ms): 15407.4 | learning rate: 2.854E-05 | global batch size: 48 | lm loss: 6.160167E+00 | loss scale: 32768.0 | grad norm: 236948.790 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4609/ 159576 | consumed samples: 103104 | elapsed time per iteration (ms): 15521.9 | learning rate: 2.855E-05 | global batch size: 48 | lm loss: 6.368454E+00 | loss scale: 32768.0 | grad norm: 346716.620 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4610/ 159576 | consumed samples: 103152 | elapsed time per iteration (ms): 15546.6 | learning rate: 2.857E-05 | global batch size: 48 | lm loss: 6.485950E+00 | loss scale: 32768.0 | grad norm: 249193.625 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4611/ 159576 | consumed samples: 103200 | elapsed time per iteration (ms): 15842.5 | learning rate: 2.858E-05 | global batch size: 48 | lm loss: 6.433112E+00 | loss scale: 32768.0 | grad norm: 245691.542 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4612/ 159576 | consumed samples: 103248 | elapsed time per iteration (ms): 15452.2 | learning rate: 2.859E-05 | global batch size: 48 | lm loss: 6.453573E+00 | loss scale: 32768.0 | grad norm: 326844.652 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4613/ 159576 | consumed samples: 103296 | elapsed time per iteration (ms): 15454.7 | learning rate: 2.861E-05 | global batch size: 48 | lm loss: 6.431165E+00 | loss scale: 32768.0 | grad norm: 289334.369 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4614/ 159576 | consumed samples: 103344 | elapsed time per iteration (ms): 15458.5 | learning rate: 2.862E-05 | global batch size: 48 | lm loss: 6.229577E+00 | loss scale: 32768.0 | grad norm: 256574.569 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4615/ 159576 | consumed samples: 103392 | elapsed time per iteration (ms): 15900.6 | learning rate: 2.863E-05 | global batch size: 48 | lm loss: 6.432065E+00 | loss scale: 32768.0 | grad norm: 273324.041 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4616/ 159576 | consumed samples: 103440 | elapsed time per iteration (ms): 15568.2 | learning rate: 2.865E-05 | global batch size: 48 | lm loss: 6.373868E+00 | loss scale: 32768.0 | grad norm: 289471.232 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4617/ 159576 | consumed samples: 103488 | elapsed time per iteration (ms): 15491.7 | learning rate: 2.866E-05 | global batch size: 48 | lm loss: 6.302549E+00 | loss scale: 32768.0 | grad norm: 421148.983 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4618/ 159576 | consumed samples: 103536 | elapsed time per iteration (ms): 15549.9 | learning rate: 2.867E-05 | global batch size: 48 | lm loss: 6.278319E+00 | loss scale: 32768.0 | grad norm: 346570.622 | num zeros: 0.0 | number of skipped 
iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4619/ 159576 | consumed samples: 103584 | elapsed time per iteration (ms): 15749.4 | learning rate: 2.869E-05 | global batch size: 48 | lm loss: 6.394638E+00 | loss scale: 32768.0 | grad norm: 356110.872 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4620/ 159576 | consumed samples: 103632 | elapsed time per iteration (ms): 15472.2 | learning rate: 2.870E-05 | global batch size: 48 | lm loss: 6.303448E+00 | loss scale: 32768.0 | grad norm: 328724.972 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4621/ 159576 | consumed samples: 103680 | elapsed time per iteration (ms): 15427.3 | learning rate: 2.871E-05 | global batch size: 48 | lm loss: 6.544609E+00 | loss scale: 32768.0 | grad norm: 324100.834 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4622/ 159576 | consumed samples: 103728 | elapsed time per iteration (ms): 15472.5 | learning rate: 2.873E-05 | global batch size: 48 | lm loss: 6.314513E+00 | loss scale: 32768.0 | grad norm: 275878.819 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4623/ 159576 | consumed samples: 103776 | elapsed time per iteration (ms): 15583.2 | learning rate: 2.874E-05 | global batch size: 48 | lm loss: 6.398262E+00 | loss scale: 32768.0 | grad norm: 263126.230 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4624/ 159576 | consumed samples: 103824 | elapsed time per iteration (ms): 15483.7 | learning rate: 2.875E-05 | global batch size: 48 | lm loss: 6.474843E+00 | loss scale: 32768.0 | grad norm: 242329.963 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4625/ 159576 | consumed samples: 103872 | elapsed time per iteration (ms): 15477.6 | learning rate: 2.877E-05 | global batch size: 48 | lm loss: 6.408014E+00 | loss scale: 32768.0 | grad norm: 267696.261 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4626/ 159576 | consumed samples: 103920 | elapsed time per iteration (ms): 15516.2 | learning rate: 2.878E-05 | global batch size: 48 | lm loss: 6.847461E+00 | loss scale: 32768.0 | grad norm: 713094.141 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4627/ 159576 | consumed samples: 103968 | elapsed time per iteration (ms): 15724.2 | learning rate: 2.879E-05 | global batch size: 48 | lm loss: 6.386415E+00 | loss scale: 32768.0 | grad norm: 272846.125 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4628/ 159576 | consumed samples: 104016 | elapsed time per iteration (ms): 15456.1 | learning rate: 2.881E-05 | global batch size: 48 | lm loss: 6.446278E+00 | loss scale: 32768.0 | grad norm: 379795.778 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4629/ 159576 | consumed samples: 104064 | elapsed time per iteration (ms): 15435.5 | learning rate: 2.882E-05 | global batch size: 48 | lm loss: 6.469239E+00 | loss scale: 32768.0 | grad norm: 207715.801 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4630/ 159576 | consumed samples: 104112 | elapsed time per iteration (ms): 
15698.1 | learning rate: 2.883E-05 | global batch size: 48 | lm loss: 6.357453E+00 | loss scale: 32768.0 | grad norm: 236792.203 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4631/ 159576 | consumed samples: 104160 | elapsed time per iteration (ms): 15489.5 | learning rate: 2.885E-05 | global batch size: 48 | lm loss: 6.448473E+00 | loss scale: 32768.0 | grad norm: 225431.411 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4632/ 159576 | consumed samples: 104208 | elapsed time per iteration (ms): 15562.5 | learning rate: 2.886E-05 | global batch size: 48 | lm loss: 6.377034E+00 | loss scale: 32768.0 | grad norm: 375353.459 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4633/ 159576 | consumed samples: 104256 | elapsed time per iteration (ms): 15569.5 | learning rate: 2.887E-05 | global batch size: 48 | lm loss: 6.516908E+00 | loss scale: 32768.0 | grad norm: 333588.373 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4634/ 159576 | consumed samples: 104304 | elapsed time per iteration (ms): 15928.9 | learning rate: 2.889E-05 | global batch size: 48 | lm loss: 6.574339E+00 | loss scale: 32768.0 | grad norm: 243589.856 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4635/ 159576 | consumed samples: 104352 | elapsed time per iteration (ms): 15531.5 | learning rate: 2.890E-05 | global batch size: 48 | lm loss: 6.475029E+00 | loss scale: 32768.0 | grad norm: 442923.681 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4636/ 159576 | consumed samples: 104400 | elapsed time per iteration (ms): 15560.0 | learning rate: 2.891E-05 | global batch size: 48 | lm loss: 6.369026E+00 | loss scale: 32768.0 | grad norm: 295484.961 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4637/ 159576 | consumed samples: 104448 | elapsed time per iteration (ms): 15543.7 | learning rate: 2.893E-05 | global batch size: 48 | lm loss: 6.490546E+00 | loss scale: 32768.0 | grad norm: 279233.122 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4638/ 159576 | consumed samples: 104496 | elapsed time per iteration (ms): 15916.4 | learning rate: 2.894E-05 | global batch size: 48 | lm loss: 6.437621E+00 | loss scale: 32768.0 | grad norm: 245214.935 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4639/ 159576 | consumed samples: 104544 | elapsed time per iteration (ms): 15547.5 | learning rate: 2.895E-05 | global batch size: 48 | lm loss: 6.491655E+00 | loss scale: 32768.0 | grad norm: 240217.342 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4640/ 159576 | consumed samples: 104592 | elapsed time per iteration (ms): 15573.7 | learning rate: 2.897E-05 | global batch size: 48 | lm loss: 6.455505E+00 | loss scale: 32768.0 | grad norm: 317400.165 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4641/ 159576 | consumed samples: 104640 | elapsed time per iteration (ms): 15624.7 | learning rate: 2.898E-05 | global batch size: 48 | lm loss: 6.482111E+00 | loss scale: 32768.0 | grad norm: 244102.198 | num zeros: 
0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4642/ 159576 | consumed samples: 104688 | elapsed time per iteration (ms): 16106.5 | learning rate: 2.899E-05 | global batch size: 48 | lm loss: 6.281504E+00 | loss scale: 32768.0 | grad norm: 282861.527 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4643/ 159576 | consumed samples: 104736 | elapsed time per iteration (ms): 15639.7 | learning rate: 2.901E-05 | global batch size: 48 | lm loss: 6.420715E+00 | loss scale: 32768.0 | grad norm: 274009.202 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4644/ 159576 | consumed samples: 104784 | elapsed time per iteration (ms): 15520.7 | learning rate: 2.902E-05 | global batch size: 48 | lm loss: 6.342989E+00 | loss scale: 32768.0 | grad norm: 226933.382 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4645/ 159576 | consumed samples: 104832 | elapsed time per iteration (ms): 15501.6 | learning rate: 2.903E-05 | global batch size: 48 | lm loss: 6.427937E+00 | loss scale: 32768.0 | grad norm: 278047.939 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4646/ 159576 | consumed samples: 104880 | elapsed time per iteration (ms): 15629.3 | learning rate: 2.905E-05 | global batch size: 48 | lm loss: 6.294481E+00 | loss scale: 32768.0 | grad norm: 235356.190 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4647/ 159576 | consumed samples: 104928 | elapsed time per iteration (ms): 15591.9 | learning rate: 2.906E-05 | global batch size: 48 | lm loss: 6.363388E+00 | loss scale: 32768.0 | grad norm: 600293.405 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4648/ 159576 | consumed samples: 104976 | elapsed time per iteration (ms): 15595.2 | learning rate: 2.907E-05 | global batch size: 48 | lm loss: 6.377505E+00 | loss scale: 32768.0 | grad norm: 331377.856 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4649/ 159576 | consumed samples: 105024 | elapsed time per iteration (ms): 15628.4 | learning rate: 2.909E-05 | global batch size: 48 | lm loss: 6.381812E+00 | loss scale: 32768.0 | grad norm: 200005.238 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4650/ 159576 | consumed samples: 105072 | elapsed time per iteration (ms): 15748.7 | learning rate: 2.910E-05 | global batch size: 48 | lm loss: 6.338908E+00 | loss scale: 32768.0 | grad norm: 242913.858 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4651/ 159576 | consumed samples: 105120 | elapsed time per iteration (ms): 15511.3 | learning rate: 2.911E-05 | global batch size: 48 | lm loss: 6.419736E+00 | loss scale: 32768.0 | grad norm: 330409.745 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4652/ 159576 | consumed samples: 105168 | elapsed time per iteration (ms): 15516.3 | learning rate: 2.913E-05 | global batch size: 48 | lm loss: 6.404620E+00 | loss scale: 32768.0 | grad norm: 318144.336 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4653/ 159576 | consumed samples: 105216 | elapsed 
time per iteration (ms): 15876.3 | learning rate: 2.914E-05 | global batch size: 48 | lm loss: 6.377990E+00 | loss scale: 32768.0 | grad norm: 232202.485 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4654/ 159576 | consumed samples: 105264 | elapsed time per iteration (ms): 15718.5 | learning rate: 2.915E-05 | global batch size: 48 | lm loss: 6.383665E+00 | loss scale: 32768.0 | grad norm: 241524.475 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4655/ 159576 | consumed samples: 105312 | elapsed time per iteration (ms): 15610.4 | learning rate: 2.917E-05 | global batch size: 48 | lm loss: 6.403493E+00 | loss scale: 32768.0 | grad norm: 373231.364 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4656/ 159576 | consumed samples: 105360 | elapsed time per iteration (ms): 15640.8 | learning rate: 2.918E-05 | global batch size: 48 | lm loss: 6.329133E+00 | loss scale: 32768.0 | grad norm: 286954.758 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4657/ 159576 | consumed samples: 105408 | elapsed time per iteration (ms): 15996.4 | learning rate: 2.919E-05 | global batch size: 48 | lm loss: 6.748344E+00 | loss scale: 32768.0 | grad norm: 260947.100 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4658/ 159576 | consumed samples: 105456 | elapsed time per iteration (ms): 15522.2 | learning rate: 2.921E-05 | global batch size: 48 | lm loss: 6.315388E+00 | loss scale: 32768.0 | grad norm: 279560.800 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4659/ 159576 | consumed samples: 105504 | elapsed time per iteration (ms): 15546.8 | learning rate: 2.922E-05 | global batch size: 48 | lm loss: 6.351707E+00 | loss scale: 32768.0 | grad norm: 270238.544 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4660/ 159576 | consumed samples: 105552 | elapsed time per iteration (ms): 15483.2 | learning rate: 2.923E-05 | global batch size: 48 | lm loss: 6.338678E+00 | loss scale: 32768.0 | grad norm: 299765.314 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4661/ 159576 | consumed samples: 105600 | elapsed time per iteration (ms): 15828.0 | learning rate: 2.925E-05 | global batch size: 48 | lm loss: 6.427124E+00 | loss scale: 32768.0 | grad norm: 302484.019 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4662/ 159576 | consumed samples: 105648 | elapsed time per iteration (ms): 15644.1 | learning rate: 2.926E-05 | global batch size: 48 | lm loss: 6.407690E+00 | loss scale: 32768.0 | grad norm: 286169.997 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4663/ 159576 | consumed samples: 105696 | elapsed time per iteration (ms): 15583.7 | learning rate: 2.927E-05 | global batch size: 48 | lm loss: 6.254132E+00 | loss scale: 32768.0 | grad norm: 276778.381 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4664/ 159576 | consumed samples: 105744 | elapsed time per iteration (ms): 15651.6 | learning rate: 2.929E-05 | global batch size: 48 | lm loss: 6.469905E+00 | loss scale: 32768.0 | grad norm: 
279741.368 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4665/ 159576 | consumed samples: 105792 | elapsed time per iteration (ms): 15818.3 | learning rate: 2.930E-05 | global batch size: 48 | lm loss: 6.508596E+00 | loss scale: 32768.0 | grad norm: 336670.270 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4666/ 159576 | consumed samples: 105840 | elapsed time per iteration (ms): 15552.5 | learning rate: 2.931E-05 | global batch size: 48 | lm loss: 6.434944E+00 | loss scale: 32768.0 | grad norm: 242396.784 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4667/ 159576 | consumed samples: 105888 | elapsed time per iteration (ms): 15512.6 | learning rate: 2.933E-05 | global batch size: 48 | lm loss: 6.510550E+00 | loss scale: 32768.0 | grad norm: 252220.315 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4668/ 159576 | consumed samples: 105936 | elapsed time per iteration (ms): 15495.7 | learning rate: 2.934E-05 | global batch size: 48 | lm loss: 6.399008E+00 | loss scale: 32768.0 | grad norm: 288495.864 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4669/ 159576 | consumed samples: 105984 | elapsed time per iteration (ms): 15668.5 | learning rate: 2.935E-05 | global batch size: 48 | lm loss: 6.404999E+00 | loss scale: 32768.0 | grad norm: 244327.032 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4670/ 159576 | consumed samples: 106032 | elapsed time per iteration (ms): 15562.9 | learning rate: 2.937E-05 | global batch size: 48 | lm loss: 6.418772E+00 | loss scale: 32768.0 | grad norm: 313672.915 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4671/ 159576 | consumed samples: 106080 | elapsed time per iteration (ms): 15630.7 | learning rate: 2.938E-05 | global batch size: 48 | lm loss: 6.361070E+00 | loss scale: 32768.0 | grad norm: 276763.857 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4672/ 159576 | consumed samples: 106128 | elapsed time per iteration (ms): 15597.8 | learning rate: 2.939E-05 | global batch size: 48 | lm loss: 6.477580E+00 | loss scale: 32768.0 | grad norm: 230503.822 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4673/ 159576 | consumed samples: 106176 | elapsed time per iteration (ms): 15696.4 | learning rate: 2.941E-05 | global batch size: 48 | lm loss: 6.517149E+00 | loss scale: 32768.0 | grad norm: 217937.765 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4674/ 159576 | consumed samples: 106224 | elapsed time per iteration (ms): 15548.7 | learning rate: 2.942E-05 | global batch size: 48 | lm loss: 6.380251E+00 | loss scale: 32768.0 | grad norm: 267703.433 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4675/ 159576 | consumed samples: 106272 | elapsed time per iteration (ms): 15515.6 | learning rate: 2.943E-05 | global batch size: 48 | lm loss: 6.348250E+00 | loss scale: 32768.0 | grad norm: 309305.174 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4676/ 159576 | consumed 
samples: 106320 | elapsed time per iteration (ms): 15795.7 | learning rate: 2.945E-05 | global batch size: 48 | lm loss: 6.461040E+00 | loss scale: 32768.0 | grad norm: 285074.708 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4677/ 159576 | consumed samples: 106368 | elapsed time per iteration (ms): 15718.4 | learning rate: 2.946E-05 | global batch size: 48 | lm loss: 6.388801E+00 | loss scale: 32768.0 | grad norm: 292644.236 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4678/ 159576 | consumed samples: 106416 | elapsed time per iteration (ms): 15585.4 | learning rate: 2.947E-05 | global batch size: 48 | lm loss: 6.417225E+00 | loss scale: 32768.0 | grad norm: 334812.598 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4679/ 159576 | consumed samples: 106464 | elapsed time per iteration (ms): 15631.1 | learning rate: 2.949E-05 | global batch size: 48 | lm loss: 6.357790E+00 | loss scale: 32768.0 | grad norm: 301017.925 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4680/ 159576 | consumed samples: 106512 | elapsed time per iteration (ms): 15891.7 | learning rate: 2.950E-05 | global batch size: 48 | lm loss: 6.556364E+00 | loss scale: 32768.0 | grad norm: 280065.506 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4681/ 159576 | consumed samples: 106560 | elapsed time per iteration (ms): 15562.2 | learning rate: 2.951E-05 | global batch size: 48 | lm loss: 6.393982E+00 | loss scale: 32768.0 | grad norm: 242731.164 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4682/ 159576 | consumed samples: 106608 | elapsed time per iteration (ms): 15526.5 | learning rate: 2.953E-05 | global batch size: 48 | lm loss: 6.396220E+00 | loss scale: 32768.0 | grad norm: 407344.753 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4683/ 159576 | consumed samples: 106656 | elapsed time per iteration (ms): 15526.3 | learning rate: 2.954E-05 | global batch size: 48 | lm loss: 6.396249E+00 | loss scale: 32768.0 | grad norm: 300342.299 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4684/ 159576 | consumed samples: 106704 | elapsed time per iteration (ms): 15885.4 | learning rate: 2.955E-05 | global batch size: 48 | lm loss: 6.375283E+00 | loss scale: 32768.0 | grad norm: 296501.436 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4685/ 159576 | consumed samples: 106752 | elapsed time per iteration (ms): 15527.4 | learning rate: 2.957E-05 | global batch size: 48 | lm loss: 6.418046E+00 | loss scale: 32768.0 | grad norm: 290100.249 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4686/ 159576 | consumed samples: 106800 | elapsed time per iteration (ms): 15621.1 | learning rate: 2.958E-05 | global batch size: 48 | lm loss: 6.300463E+00 | loss scale: 32768.0 | grad norm: 265814.471 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4687/ 159576 | consumed samples: 106848 | elapsed time per iteration (ms): 15592.0 | learning rate: 2.959E-05 | global batch size: 48 | lm loss: 6.440179E+00 | loss 
scale: 32768.0 | grad norm: 354690.307 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4688/ 159576 | consumed samples: 106896 | elapsed time per iteration (ms): 15963.5 | learning rate: 2.961E-05 | global batch size: 48 | lm loss: 6.396194E+00 | loss scale: 32768.0 | grad norm: 259594.010 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4689/ 159576 | consumed samples: 106944 | elapsed time per iteration (ms): 15540.2 | learning rate: 2.962E-05 | global batch size: 48 | lm loss: 6.459390E+00 | loss scale: 32768.0 | grad norm: 326661.756 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4690/ 159576 | consumed samples: 106992 | elapsed time per iteration (ms): 15512.7 | learning rate: 2.963E-05 | global batch size: 48 | lm loss: 6.324084E+00 | loss scale: 32768.0 | grad norm: 288829.158 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4691/ 159576 | consumed samples: 107040 | elapsed time per iteration (ms): 8709.6 | learning rate: 2.963E-05 | global batch size: 48 | lm loss: 6.781525E+00 | loss scale: 16384.0 | grad norm: 288829.158 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4692/ 159576 | consumed samples: 107088 | elapsed time per iteration (ms): 15305.7 | learning rate: 2.964E-05 | global batch size: 48 | lm loss: 6.431325E+00 | loss scale: 16384.0 | grad norm: 145022.360 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4693/ 159576 | consumed samples: 107136 | elapsed time per iteration (ms): 15550.9 | learning rate: 2.966E-05 | global batch size: 48 | lm loss: 6.516616E+00 | loss scale: 16384.0 | grad norm: 155613.709 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4694/ 159576 | consumed samples: 107184 | elapsed time per iteration (ms): 15526.9 | learning rate: 2.967E-05 | global batch size: 48 | lm loss: 6.387960E+00 | loss scale: 16384.0 | grad norm: 134461.471 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4695/ 159576 | consumed samples: 107232 | elapsed time per iteration (ms): 15497.0 | learning rate: 2.968E-05 | global batch size: 48 | lm loss: 6.392653E+00 | loss scale: 16384.0 | grad norm: 141822.076 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4696/ 159576 | consumed samples: 107280 | elapsed time per iteration (ms): 15923.9 | learning rate: 2.970E-05 | global batch size: 48 | lm loss: 6.412030E+00 | loss scale: 16384.0 | grad norm: 175057.651 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4697/ 159576 | consumed samples: 107328 | elapsed time per iteration (ms): 15425.2 | learning rate: 2.971E-05 | global batch size: 48 | lm loss: 6.373864E+00 | loss scale: 16384.0 | grad norm: 282779.549 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4698/ 159576 | consumed samples: 107376 | elapsed time per iteration (ms): 15454.6 | learning rate: 2.972E-05 | global batch size: 48 | lm loss: 6.306759E+00 | loss scale: 16384.0 | grad norm: 136700.298 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 
4699/ 159576 | consumed samples: 107424 | elapsed time per iteration (ms): 15528.9 | learning rate: 2.974E-05 | global batch size: 48 | lm loss: 6.335629E+00 | loss scale: 16384.0 | grad norm: 184501.539 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4700/ 159576 | consumed samples: 107472 | elapsed time per iteration (ms): 15956.8 | learning rate: 2.975E-05 | global batch size: 48 | lm loss: 6.408161E+00 | loss scale: 16384.0 | grad norm: 173148.921 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4701/ 159576 | consumed samples: 107520 | elapsed time per iteration (ms): 15601.2 | learning rate: 2.976E-05 | global batch size: 48 | lm loss: 6.452803E+00 | loss scale: 16384.0 | grad norm: 175212.053 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4702/ 159576 | consumed samples: 107568 | elapsed time per iteration (ms): 15499.9 | learning rate: 2.978E-05 | global batch size: 48 | lm loss: 6.444376E+00 | loss scale: 16384.0 | grad norm: 154484.468 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4703/ 159576 | consumed samples: 107616 | elapsed time per iteration (ms): 15505.8 | learning rate: 2.979E-05 | global batch size: 48 | lm loss: 6.378032E+00 | loss scale: 16384.0 | grad norm: 157853.641 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4704/ 159576 | consumed samples: 107664 | elapsed time per iteration (ms): 15797.2 | learning rate: 2.980E-05 | global batch size: 48 | lm loss: 6.433157E+00 | loss scale: 16384.0 | grad norm: 189038.636 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4705/ 159576 | consumed samples: 107712 | elapsed time per iteration (ms): 15428.0 | learning rate: 2.982E-05 | global batch size: 48 | lm loss: 6.345381E+00 | loss scale: 16384.0 | grad norm: 223066.594 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4706/ 159576 | consumed samples: 107760 | elapsed time per iteration (ms): 15506.2 | learning rate: 2.983E-05 | global batch size: 48 | lm loss: 6.409193E+00 | loss scale: 16384.0 | grad norm: 138366.342 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4707/ 159576 | consumed samples: 107808 | elapsed time per iteration (ms): 15469.9 | learning rate: 2.984E-05 | global batch size: 48 | lm loss: 6.454758E+00 | loss scale: 16384.0 | grad norm: 144072.711 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4708/ 159576 | consumed samples: 107856 | elapsed time per iteration (ms): 15711.5 | learning rate: 2.986E-05 | global batch size: 48 | lm loss: 6.418115E+00 | loss scale: 16384.0 | grad norm: 160060.361 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4709/ 159576 | consumed samples: 107904 | elapsed time per iteration (ms): 15549.5 | learning rate: 2.987E-05 | global batch size: 48 | lm loss: 6.323099E+00 | loss scale: 16384.0 | grad norm: 158794.827 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4710/ 159576 | consumed samples: 107952 | elapsed time per iteration (ms): 15458.0 | learning rate: 2.988E-05 | global batch size: 48 | lm loss: 
6.418284E+00 | loss scale: 16384.0 | grad norm: 172985.051 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4711/ 159576 | consumed samples: 108000 | elapsed time per iteration (ms): 15477.2 | learning rate: 2.990E-05 | global batch size: 48 | lm loss: 6.449984E+00 | loss scale: 16384.0 | grad norm: 151942.015 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4712/ 159576 | consumed samples: 108048 | elapsed time per iteration (ms): 15912.6 | learning rate: 2.991E-05 | global batch size: 48 | lm loss: 6.331490E+00 | loss scale: 16384.0 | grad norm: 148710.284 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4713/ 159576 | consumed samples: 108096 | elapsed time per iteration (ms): 15440.5 | learning rate: 2.992E-05 | global batch size: 48 | lm loss: 6.445600E+00 | loss scale: 16384.0 | grad norm: 136119.725 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4714/ 159576 | consumed samples: 108144 | elapsed time per iteration (ms): 15519.8 | learning rate: 2.994E-05 | global batch size: 48 | lm loss: 6.276518E+00 | loss scale: 16384.0 | grad norm: 170811.199 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4715/ 159576 | consumed samples: 108192 | elapsed time per iteration (ms): 15866.2 | learning rate: 2.995E-05 | global batch size: 48 | lm loss: 6.430917E+00 | loss scale: 16384.0 | grad norm: 145058.329 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4716/ 159576 | consumed samples: 108240 | elapsed time per iteration (ms): 15520.8 | learning rate: 2.996E-05 | global batch size: 48 | lm loss: 6.459754E+00 | loss scale: 16384.0 | grad norm: 146862.274 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4717/ 159576 | consumed samples: 108288 | elapsed time per iteration (ms): 15578.0 | learning rate: 2.998E-05 | global batch size: 48 | lm loss: 6.447017E+00 | loss scale: 16384.0 | grad norm: 172505.739 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4718/ 159576 | consumed samples: 108336 | elapsed time per iteration (ms): 15434.8 | learning rate: 2.999E-05 | global batch size: 48 | lm loss: 6.316633E+00 | loss scale: 16384.0 | grad norm: 130149.169 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4719/ 159576 | consumed samples: 108384 | elapsed time per iteration (ms): 15703.7 | learning rate: 3.000E-05 | global batch size: 48 | lm loss: 6.376626E+00 | loss scale: 16384.0 | grad norm: 198273.301 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4720/ 159576 | consumed samples: 108432 | elapsed time per iteration (ms): 15522.7 | learning rate: 3.002E-05 | global batch size: 48 | lm loss: 6.340569E+00 | loss scale: 16384.0 | grad norm: 189583.946 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4721/ 159576 | consumed samples: 108480 | elapsed time per iteration (ms): 15419.9 | learning rate: 3.003E-05 | global batch size: 48 | lm loss: 6.519832E+00 | loss scale: 16384.0 | grad norm: 148280.410 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | 
-time (ms) - iteration 4722/ 159576 | consumed samples: 108528 | elapsed time per iteration (ms): 15537.6 | learning rate: 3.004E-05 | global batch size: 48 | lm loss: 6.519564E+00 | loss scale: 16384.0 | grad norm: 165136.082 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4723/ 159576 | consumed samples: 108576 | elapsed time per iteration (ms): 15984.2 | learning rate: 3.006E-05 | global batch size: 48 | lm loss: 6.331813E+00 | loss scale: 16384.0 | grad norm: 137134.914 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4724/ 159576 | consumed samples: 108624 | elapsed time per iteration (ms): 15591.8 | learning rate: 3.007E-05 | global batch size: 48 | lm loss: 6.417581E+00 | loss scale: 16384.0 | grad norm: 135525.990 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4725/ 159576 | consumed samples: 108672 | elapsed time per iteration (ms): 15458.7 | learning rate: 3.008E-05 | global batch size: 48 | lm loss: 6.369280E+00 | loss scale: 16384.0 | grad norm: 135730.698 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4726/ 159576 | consumed samples: 108720 | elapsed time per iteration (ms): 15476.9 | learning rate: 3.010E-05 | global batch size: 48 | lm loss: 6.320598E+00 | loss scale: 16384.0 | grad norm: 147233.060 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4727/ 159576 | consumed samples: 108768 | elapsed time per iteration (ms): 15812.7 | learning rate: 3.011E-05 | global batch size: 48 | lm loss: 6.469586E+00 | loss scale: 16384.0 | grad norm: 164519.317 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4728/ 159576 | consumed samples: 108816 | elapsed time per iteration (ms): 15490.9 | learning rate: 3.012E-05 | global batch size: 48 | lm loss: 6.473386E+00 | loss scale: 16384.0 | grad norm: 151619.547 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4729/ 159576 | consumed samples: 108864 | elapsed time per iteration (ms): 15470.7 | learning rate: 3.014E-05 | global batch size: 48 | lm loss: 6.340328E+00 | loss scale: 16384.0 | grad norm: 137036.044 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4730/ 159576 | consumed samples: 108912 | elapsed time per iteration (ms): 15531.2 | learning rate: 3.015E-05 | global batch size: 48 | lm loss: 6.394744E+00 | loss scale: 16384.0 | grad norm: 146186.033 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4731/ 159576 | consumed samples: 108960 | elapsed time per iteration (ms): 15606.4 | learning rate: 3.016E-05 | global batch size: 48 | lm loss: 6.362489E+00 | loss scale: 16384.0 | grad norm: 187444.936 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4732/ 159576 | consumed samples: 109008 | elapsed time per iteration (ms): 15504.3 | learning rate: 3.018E-05 | global batch size: 48 | lm loss: 6.456880E+00 | loss scale: 16384.0 | grad norm: 129595.559 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4733/ 159576 | consumed samples: 109056 | elapsed time per iteration (ms): 15474.7 | learning rate: 3.019E-05 | global 
(tail of the record before iteration 4734) batch size: 48 | lm loss: 6.443705E+00 | loss scale: 16384.0 | grad norm: 137176.536 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |

iteration (of 159576) | consumed samples | elapsed time per iteration (ms) | learning rate | lm loss | loss scale | grad norm
(every record also carries: global batch size: 48 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0)
4734 | 109104 | 15468.7 | 3.020E-05 | 6.325924E+00 | 16384.0 | 130886.931
4735 | 109152 | 15622.9 | 3.022E-05 | 6.367020E+00 | 16384.0 | 133365.928
4736 | 109200 | 15496.0 | 3.023E-05 | 6.366150E+00 | 16384.0 | 170880.695
4737 | 109248 | 15489.1 | 3.024E-05 | 6.352594E+00 | 16384.0 | 126383.624
4738 | 109296 | 15753.5 | 3.026E-05 | 6.439698E+00 | 16384.0 | 178764.163
4739 | 109344 | 15669.9 | 3.027E-05 | 6.379218E+00 | 16384.0 | 140248.496
4740 | 109392 | 15472.2 | 3.028E-05 | 6.455700E+00 | 16384.0 | 141297.672
4741 | 109440 | 15470.3 | 3.030E-05 | 6.395582E+00 | 16384.0 | 132933.676
4742 | 109488 | 15846.4 | 3.031E-05 | 6.391361E+00 | 16384.0 | 118703.557
4743 | 109536 | 15513.5 | 3.032E-05 | 6.428627E+00 | 16384.0 | 138048.574
4744 | 109584 | 15514.2 | 3.034E-05 | 6.294309E+00 | 16384.0 | 140003.576
4745 | 109632 | 15479.8 | 3.035E-05 | 6.442544E+00 | 16384.0 | 137520.854
4746 | 109680 | 15909.9 | 3.036E-05 | 6.330937E+00 | 16384.0 | 133869.361
4747 | 109728 | 15438.5 | 3.038E-05 | 6.375879E+00 | 16384.0 | 186074.447
4748 | 109776 | 15478.1 | 3.039E-05 | 6.291435E+00 | 16384.0 | 133042.212
4749 | 109824 | 15511.0 | 3.040E-05 | 6.392264E+00 | 16384.0 | 142954.276
4750 | 109872 | 15876.7 | 3.042E-05 | 7.872174E+00 | 16384.0 | 409825.671
4751 | 109920 | 15539.2 | 3.043E-05 | 6.478594E+00 | 16384.0 | 125638.703
4752 | 109968 | 15507.7 | 3.044E-05 | 6.357571E+00 | 16384.0 | 108403.375
4753 | 110016 | 15485.4 | 3.046E-05 | 6.517112E+00 | 16384.0 | 101971.645
4754 | 110064 | 15669.7 | 3.047E-05 | 6.311660E+00 | 16384.0 | 117424.161
4755 | 110112 | 15529.0 | 3.048E-05 | 6.452873E+00 | 16384.0 | 153333.779
4756 | 110160 | 15556.8 | 3.050E-05 | 6.470776E+00 | 16384.0 | 123606.469
4757 | 110208 | 15535.1 | 3.051E-05 | 6.444992E+00 | 16384.0 | 103337.864
4758 | 110256 | 15670.4 | 3.052E-05 | 6.402925E+00 | 16384.0 | 145142.298
4759 | 110304 | 15615.8 | 3.054E-05 | 6.383159E+00 | 16384.0 | 115666.450
4760 | 110352 | 15593.7 | 3.055E-05 | 6.288662E+00 | 16384.0 | 125590.923
4761 | 110400 | 15582.7 | 3.056E-05 | 6.460382E+00 | 16384.0 | 131535.871
4762 | 110448 | 15777.3 | 3.058E-05 | 6.421331E+00 | 16384.0 | 123507.404
4763 | 110496 | 15542.1 | 3.059E-05 | 6.471745E+00 | 16384.0 | 142533.784
4764 | 110544 | 15505.7 | 3.060E-05 | 6.437591E+00 | 16384.0 | 150206.216
4765 | 110592 | 15784.9 | 3.062E-05 | 6.426904E+00 | 16384.0 | 117533.195
4766 | 110640 | 15571.9 | 3.063E-05 | 6.361554E+00 | 16384.0 | 125319.029
4767 | 110688 | 15502.5 | 3.064E-05 | 6.404096E+00 | 16384.0 | 137718.459
4768 | 110736 | 15543.8 | 3.066E-05 | 6.437445E+00 | 16384.0 | 138623.647
4769 | 110784 | 15859.0 | 3.067E-05 | 6.395863E+00 | 16384.0 | 127878.926
4770 | 110832 | 15536.9 | 3.068E-05 | 6.561028E+00 | 16384.0 | 124917.908
4771 | 110880 | 15506.9 | 3.070E-05 | 6.471921E+00 | 16384.0 | 161855.552
4772 | 110928 | 15469.5 | 3.071E-05 | 6.442107E+00 | 16384.0 | 174619.623
4773 | 110976 | 15874.3 | 3.072E-05 | 6.450697E+00 | 16384.0 | 128857.784
4774 | 111024 | 15476.2 | 3.074E-05 | 6.409184E+00 | 16384.0 | 167963.478
4775 | 111072 | 15524.6 | 3.075E-05 | 6.521546E+00 | 16384.0 | 160789.278
4776 | 111120 | 15522.1 | 3.076E-05 | 6.392659E+00 | 16384.0 | 144341.782
4777 | 111168 | 15807.4 | 3.078E-05 | 6.295141E+00 | 16384.0 | 127243.790
4778 | 111216 | 15569.3 | 3.079E-05 | 6.327214E+00 | 16384.0 | 126284.160
4779 | 111264 | 15403.5 | 3.080E-05 | 6.573749E+00 | 16384.0 | 122918.062
4780 | 111312 | 15381.1 | 3.082E-05 | 6.433424E+00 | 16384.0 | 124694.541
4781 | 111360 | 15664.5 | 3.083E-05 | 6.469074E+00 | 16384.0 | 147526.104
4782 | 111408 | 15406.6 | 3.084E-05 | 6.349575E+00 | 16384.0 | 124417.623
4783 | 111456 | 15497.8 | 3.086E-05 | 6.254411E+00 | 16384.0 | 132978.536
4784 | 111504 | 15491.3 | 3.087E-05 | 6.407672E+00 | 16384.0 | 136226.752
4785 | 111552 | 15585.7 | 3.088E-05 | 6.340271E+00 | 16384.0 | 143403.987
4786 | 111600 | 15420.5 | 3.090E-05 | 6.532565E+00 | 16384.0 | 146250.150
4787 | 111648 | 15350.6 | 3.091E-05 | 6.387796E+00 | 16384.0 | 219665.169
[2021-09-24 21:07:15] PULSE: tr8-104B is waiting for the previous job to finish before scheduling a new one using the dependency mechanism (1165978_[1-10%1] on 'gpu_p13' partition)
[2021-09-24 21:07:15] PULSE: tr8-104B is running for 15:15:04 since 2021-09-24T05:52:11 (1162855_1 on 'gpu_p13' partition (r6i4n[5,7],r6i5n[2,7-8],r6i6n[0,2,6],r7i2n[4-5],r7i6n[2-4],r7i7n[7-8],r8i0n[2-3,5-8],r8i1n[0,2-4],r8i2n8,r8i3n[0-2],r8i5n[3-4],r8i7n[3-8],r9i0n[0-2],r9i1n[0-3],r9i2n[3-5,8],r9i3n[0-1,7-8],r9i4n[0-2],r9i5n[3-8],r9i6n[0,7-8]))
4788 | 111696 | 15408.2 | 3.092E-05 | 6.385682E+00 | 16384.0 | 135205.771
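Every record in this log follows one fixed, pipe-separated format ("iteration N/ 159576 | consumed samples: ... | lm loss: ... | grad norm: ..."), so it can be parsed mechanically. The snippet below is not part of the original log; it is a minimal Python sketch of such a parser, assuming the exact field labels used in this file and one record per physical line (as in the unwrapped log). The path logs/main_log.txt is this log's own name, and parse_log is a hypothetical helper.

import re

RECORD_RE = re.compile(
    r"iteration\s+(\d+)/\s*(\d+)\s*\|"
    r"\s*consumed samples:\s*(\d+)\s*\|"
    r"\s*elapsed time per iteration \(ms\):\s*([\d.]+)\s*\|"
    r"\s*learning rate:\s*([\d.E+-]+)\s*\|"
    r"\s*global batch size:\s*(\d+)\s*\|"
    r"\s*lm loss:\s*([\d.E+-]+)\s*\|"
    r"\s*loss scale:\s*([\d.]+)\s*\|"
    r"\s*grad norm:\s*([\d.]+)")

def parse_log(path):
    """Collect one dict per training iteration found in the log."""
    records = []
    with open(path, errors="replace") as fh:
        for line in fh:
            m = RECORD_RE.search(line)
            if m is None:
                continue  # PULSE lines, timer lines, warnings, ...
            it, total, samples, ms, lr, gbs, loss, scale, gnorm = m.groups()
            records.append({
                "iteration": int(it),
                "consumed_samples": int(samples),
                "ms_per_iter": float(ms),
                "lr": float(lr),
                "global_batch_size": int(gbs),
                "lm_loss": float(loss),
                "loss_scale": float(scale),
                "grad_norm": float(gnorm),
            })
    return records

if __name__ == "__main__":
    recs = parse_log("logs/main_log.txt")
    print(len(recs), "records; last lm loss:", recs[-1]["lm_loss"])

The later snippets below operate on the list of dicts this parser returns.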
(log continues: iteration | consumed samples | elapsed ms/iter | learning rate | lm loss | loss scale | grad norm; global batch size 48, num zeros 0.0, no skipped or nan iterations throughout)
4789 | 111744 | 15723.0 | 3.094E-05 | 6.382418E+00 | 16384.0 | 135775.375
4790 | 111792 | 15412.3 | 3.095E-05 | 6.349115E+00 | 16384.0 | 161890.935
4791 | 111840 | 15444.3 | 3.096E-05 | 6.551302E+00 | 16384.0 | 160659.721
4792 | 111888 | 15819.0 | 3.098E-05 | 6.439594E+00 | 16384.0 | 133779.922
4793 | 111936 | 15566.2 | 3.099E-05 | 6.469571E+00 | 16384.0 | 134021.262
4794 | 111984 | 15417.1 | 3.100E-05 | 6.302731E+00 | 16384.0 | 144273.145
4795 | 112032 | 15348.6 | 3.102E-05 | 6.524598E+00 | 16384.0 | 173531.750
4796 | 112080 | 15687.5 | 3.103E-05 | 6.379292E+00 | 16384.0 | 135799.927
4797 | 112128 | 15525.4 | 3.104E-05 | 6.363866E+00 | 16384.0 | 157197.319
4798 | 112176 | 15407.8 | 3.106E-05 | 6.301018E+00 | 16384.0 | 157927.560
4799 | 112224 | 15420.4 | 3.107E-05 | 6.529522E+00 | 16384.0 | 161359.540
4800 | 112272 | 15797.9 | 3.108E-05 | 6.347914E+00 | 16384.0 | 147972.460
4801 | 112320 | 15327.2 | 3.110E-05 | 6.375738E+00 | 16384.0 | 153820.838
4802 | 112368 | 15430.2 | 3.111E-05 | 6.380699E+00 | 16384.0 | 200141.688
4803 | 112416 | 15437.0 | 3.112E-05 | 6.346474E+00 | 16384.0 | 150956.672
4804 | 112464 | 15932.7 | 3.114E-05 | 6.424392E+00 | 16384.0 | 144387.858
4805 | 112512 | 15535.0 | 3.115E-05 | 6.327216E+00 | 16384.0 | 145981.007
4806 | 112560 | 15433.8 | 3.116E-05 | 6.352614E+00 | 16384.0 | 159012.654
4807 | 112608 | 15389.4 | 3.118E-05 | 6.523698E+00 | 16384.0 | 183142.813
4808 | 112656 | 15811.1 | 3.119E-05 | 6.425416E+00 | 16384.0 | 158356.721
4809 | 112704 | 15390.9 | 3.120E-05 | 6.460537E+00 | 16384.0 | 160752.580
4810 | 112752 | 15403.0 | 3.122E-05 | 6.358703E+00 | 16384.0 | 136445.446
4811 | 112800 | 15361.3 | 3.123E-05 | 6.445686E+00 | 16384.0 | 150287.223
4812 | 112848 | 15635.2 | 3.124E-05 | 6.351339E+00 | 16384.0 | 127746.325
4813 | 112896 | 15458.8 | 3.126E-05 | 6.509888E+00 | 16384.0 | 142135.548
4814 | 112944 | 15373.2 | 3.127E-05 | 6.393768E+00 | 16384.0 | 140003.150
4815 | 112992 | 15438.1 | 3.128E-05 | 6.501161E+00 | 16384.0 | 148857.005
4816 | 113040 | 15632.8 | 3.130E-05 | 6.330061E+00 | 16384.0 | 147693.703
4817 | 113088 | 15360.6 | 3.131E-05 | 6.405270E+00 | 16384.0 | 135039.455
4818 | 113136 | 15427.5 | 3.132E-05 | 6.376327E+00 | 16384.0 | 144860.784
4819 | 113184 | 15402.3 | 3.134E-05 | 6.422782E+00 | 16384.0 | 185430.422
4820 | 113232 | 15872.7 | 3.135E-05 | 6.447948E+00 | 16384.0 | 143563.779
4821 | 113280 | 15475.0 | 3.136E-05 | 6.419926E+00 | 16384.0 | 139618.440
4822 | 113328 | 15479.8 | 3.138E-05 | 6.307784E+00 | 16384.0 | 135923.320
4823 | 113376 | 15830.9 | 3.139E-05 | 6.485186E+00 | 16384.0 | 148878.956
4824 | 113424 | 15412.5 | 3.140E-05 | 6.344635E+00 | 16384.0 | 144634.532
4825 | 113472 | 15399.2 | 3.142E-05 | 6.380017E+00 | 16384.0 | 149087.377
4826 | 113520 | 15495.5 | 3.143E-05 | 6.478100E+00 | 16384.0 | 157916.270
4827 | 113568 | 15748.7 | 3.144E-05 | 6.353170E+00 | 16384.0 | 130626.129
4828 | 113616 | 15356.7 | 3.146E-05 | 6.307143E+00 | 16384.0 | 152222.347
4829 | 113664 | 15426.2 | 3.147E-05 | 6.284460E+00 | 16384.0 | 135151.282
4830 | 113712 | 15453.2 | 3.148E-05 | 6.389065E+00 | 16384.0 | 158822.080
4831 | 113760 | 15757.8 | 3.150E-05 | 6.330949E+00 | 16384.0 | 150077.176
4832 | 113808 | 8582.4 | 3.150E-05 | 6.330990E+00 | 8192.0 | 150077.176
4833 | 113856 | 14858.8 | 3.151E-05 | 6.472740E+00 | 8192.0 | 80806.673
4834 | 113904 | 15406.5 | 3.152E-05 | 6.386261E+00 | 8192.0 | 79982.750
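At iteration 4832 the loss scale drops from 16384.0 to 8192.0, the step completes in roughly half the usual time, and the learning rate and grad norm essentially repeat the values from 4831: the signature of fp16 dynamic loss scaling hitting an overflow and backing off. The class below is a generic sketch of that backoff/grow rule, not the actual Megatron-DeepSpeed scaler used for this run; the growth factor, backoff factor, and growth interval are illustrative defaults, not values taken from this log.

class DynamicLossScaler:
    """Halve the scale on overflow; double it after a run of clean steps."""

    def __init__(self, init_scale=16384.0, growth_factor=2.0,
                 backoff_factor=0.5, growth_interval=1000):
        self.scale = init_scale
        self.growth_factor = growth_factor
        self.backoff_factor = backoff_factor
        self.growth_interval = growth_interval
        self._good_steps = 0

    def update(self, found_overflow):
        """Return True if the optimizer step should be applied."""
        if found_overflow:
            # Overflow: shrink the scale, reset the streak, skip this step.
            self.scale = max(self.scale * self.backoff_factor, 1.0)
            self._good_steps = 0
            return False
        self._good_steps += 1
        if self._good_steps % self.growth_interval == 0:
            self.scale *= self.growth_factor
        return True

With these defaults, a scaler sitting at 16384.0 that sees a single overflow drops to 8192.0 (as above) and does not return to 16384.0 until 1000 consecutive clean steps have passed.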
(log continues: iteration | consumed samples | elapsed ms/iter | learning rate | lm loss | loss scale | grad norm; global batch size 48, num zeros 0.0, no skipped or nan iterations throughout)
4835 | 113952 | 15754.6 | 3.154E-05 | 6.399200E+00 | 8192.0 | 76427.802
4836 | 114000 | 15606.6 | 3.155E-05 | 6.377688E+00 | 8192.0 | 72730.651
4837 | 114048 | 15427.9 | 3.156E-05 | 6.362796E+00 | 8192.0 | 75031.879
4838 | 114096 | 15459.9 | 3.158E-05 | 6.427638E+00 | 8192.0 | 71627.109
4839 | 114144 | 15785.4 | 3.159E-05 | 6.319674E+00 | 8192.0 | 75857.181
4840 | 114192 | 15529.1 | 3.160E-05 | 6.453057E+00 | 8192.0 | 81110.609
4841 | 114240 | 15426.5 | 3.162E-05 | 6.411851E+00 | 8192.0 | 86983.700
4842 | 114288 | 15460.5 | 3.163E-05 | 6.377954E+00 | 8192.0 | 86981.542
4843 | 114336 | 15821.2 | 3.164E-05 | 6.577933E+00 | 8192.0 | 91346.895
4844 | 114384 | 15501.1 | 3.166E-05 | 6.404775E+00 | 8192.0 | 73191.069
4845 | 114432 | 15559.3 | 3.167E-05 | 6.405911E+00 | 8192.0 | 77252.377
4846 | 114480 | 15521.7 | 3.168E-05 | 6.505279E+00 | 8192.0 | 70335.265
4847 | 114528 | 15925.0 | 3.170E-05 | 6.438465E+00 | 8192.0 | 73213.704
4848 | 114576 | 15612.2 | 3.171E-05 | 6.452498E+00 | 8192.0 | 78502.943
4849 | 114624 | 15443.4 | 3.172E-05 | 6.394375E+00 | 8192.0 | 87781.535
4850 | 114672 | 15479.4 | 3.174E-05 | 6.435881E+00 | 8192.0 | 73932.494
4851 | 114720 | 15706.9 | 3.175E-05 | 6.482435E+00 | 8192.0 | 80407.010
4852 | 114768 | 15526.6 | 3.176E-05 | 6.479346E+00 | 8192.0 | 88804.640
4853 | 114816 | 15581.7 | 3.178E-05 | 6.398011E+00 | 8192.0 | 85238.079
4854 | 114864 | 15591.6 | 3.179E-05 | 6.439957E+00 | 8192.0 | 79088.978
4855 | 114912 | 15588.2 | 3.180E-05 | 6.525852E+00 | 8192.0 | 86759.095
4856 | 114960 | 15491.8 | 3.182E-05 | 6.406517E+00 | 8192.0 | 84644.761
4857 | 115008 | 15455.8 | 3.183E-05 | 6.427845E+00 | 8192.0 | 95490.221
4858 | 115056 | 15508.2 | 3.184E-05 | 6.500411E+00 | 8192.0 | 101236.693
4859 | 115104 | 15652.7 | 3.186E-05 | 6.364994E+00 | 8192.0 | 91582.098
4860 | 115152 | 15517.9 | 3.187E-05 | 6.449871E+00 | 8192.0 | 66096.086
4861 | 115200 | 15569.1 | 3.188E-05 | 6.364583E+00 | 8192.0 | 83574.580
4862 | 115248 | 15872.9 | 3.189E-05 | 6.322206E+00 | 8192.0 | 76576.722
4863 | 115296 | 15519.6 | 3.191E-05 | 6.475718E+00 | 8192.0 | 68002.307
4864 | 115344 | 15516.6 | 3.192E-05 | 6.312770E+00 | 8192.0 | 83359.552
4865 | 115392 | 15489.9 | 3.193E-05 | 6.447346E+00 | 8192.0 | 79898.278
4866 | 115440 | 15854.0 | 3.195E-05 | 6.343767E+00 | 8192.0 | 82915.939
4867 | 115488 | 15538.2 | 3.196E-05 | 6.421945E+00 | 8192.0 | 76629.129
4868 | 115536 | 15524.2 | 3.197E-05 | 6.402726E+00 | 8192.0 | 75429.794
4869 | 115584 | 15553.9 | 3.199E-05 | 6.417988E+00 | 8192.0 | 82790.972
4870 | 115632 | 15916.9 | 3.200E-05 | 6.289523E+00 | 8192.0 | 77156.616
4871 | 115680 | 15548.8 | 3.201E-05 | 6.359477E+00 | 8192.0 | 94063.189
4872 | 115728 | 15482.5 | 3.203E-05 | 6.386482E+00 | 8192.0 | 70658.588
4873 | 115776 | 15555.0 | 3.204E-05 | 6.524825E+00 | 8192.0 | 86322.654
4874 | 115824 | 15950.6 | 3.205E-05 | 6.358710E+00 | 8192.0 | 73619.690
4875 | 115872 | 15559.5 | 3.207E-05 | 6.536497E+00 | 8192.0 | 89786.377
4876 | 115920 | 15463.5 | 3.208E-05 | 6.427877E+00 | 8192.0 | 78839.432
4877 | 115968 | 15525.4 | 3.209E-05 | 6.471958E+00 | 8192.0 | 76472.776
4878 | 116016 | 15732.8 | 3.211E-05 | 6.437389E+00 | 8192.0 | 86320.939
4879 | 116064 | 15464.9 | 3.212E-05 | 6.365283E+00 | 8192.0 | 82080.986
4880 | 116112 | 15552.2 | 3.213E-05 | 6.408097E+00 | 8192.0 | 79728.972
4881 | 116160 | 15532.2 | 3.215E-05 | 6.425485E+00 | 8192.0 | 102265.159
4882 | 116208 | 15707.7 | 3.216E-05 | 6.276470E+00 | 8192.0 | 93438.364
4883 | 116256 | 15592.8 | 3.217E-05 | 6.487882E+00 | 8192.0 | 85760.044
4884 | 116304 | 15486.2 | 3.219E-05 | 6.412776E+00 | 8192.0 | 84281.777
4885 | 116352 | 15807.2 | 3.220E-05 | 6.340213E+00 | 8192.0 | 79000.522
4886 | 116400 | 15690.6 | 3.221E-05 | 6.368945E+00 | 8192.0 | 101421.475
4887 | 116448 | 15490.9 | 3.223E-05 | 6.181931E+00 | 8192.0 | 80306.960
4888 | 116496 | 15541.0 | 3.224E-05 | 6.508174E+00 | 8192.0 | 88863.260
4889 | 116544 | 15795.9 | 3.225E-05 | 6.362309E+00 | 8192.0 | 82730.432
4890 | 116592 | 15612.5 | 3.227E-05 | 6.457442E+00 | 8192.0 | 77751.832
4891 | 116640 | 15523.7 | 3.228E-05 | 6.382168E+00 | 8192.0 | 95335.147
4892 | 116688 | 15565.3 | 3.229E-05 | 6.443634E+00 | 8192.0 | 141532.607
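Two invariants hold across every complete record in this span: consumed samples advance by exactly the global batch size (48) per iteration, and the learning rate rises by about 1.3E-08 per step, i.e. the run is still inside a linear warmup. The sketch below, again not part of the original log, checks both invariants over records parsed with the earlier parse_log helper (check_span is a hypothetical name; the dict keys are the ones that parser produces).

def check_span(records):
    """Return (iteration, description, value) tuples for any violations."""
    problems = []
    for prev, cur in zip(records, records[1:]):
        # Consumed samples should grow by exactly one global batch per step.
        step = cur["consumed_samples"] - prev["consumed_samples"]
        if step != cur["global_batch_size"]:
            problems.append((cur["iteration"], "consumed-samples jump", step))
        # During warmup the learning rate should never decrease.
        if cur["lr"] < prev["lr"]:
            problems.append((cur["iteration"], "lr decreased in warmup", cur["lr"]))
    return problems

On this span the check comes back empty: even the overflow step at 4832 keeps both counters consistent (the learning rate merely repeats 3.150E-05 rather than decreasing).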
(log continues: iteration | consumed samples | elapsed ms/iter | learning rate | lm loss | loss scale | grad norm; global batch size 48, num zeros 0.0, no skipped or nan iterations throughout)
4893 | 116736 | 15920.8 | 3.231E-05 | 6.475467E+00 | 8192.0 | 99006.769
4894 | 116784 | 15438.9 | 3.232E-05 | 6.465964E+00 | 8192.0 | 104819.919
4895 | 116832 | 15486.6 | 3.233E-05 | 6.355396E+00 | 8192.0 | 88645.070
4896 | 116880 | 15530.2 | 3.235E-05 | 6.397956E+00 | 8192.0 | 97080.394
4897 | 116928 | 15972.1 | 3.236E-05 | 6.376213E+00 | 8192.0 | 91571.932
4898 | 116976 | 15582.4 | 3.237E-05 | 6.338162E+00 | 8192.0 | 95029.310
4899 | 117024 | 15514.7 | 3.239E-05 | 6.420194E+00 | 8192.0 | 115966.472
4900 | 117072 | 15492.3 | 3.240E-05 | 6.472268E+00 | 8192.0 | 117112.305
4901 | 117120 | 15707.8 | 3.241E-05 | 6.365590E+00 | 8192.0 | 126111.497
4902 | 117168 | 15440.6 | 3.243E-05 | 6.341323E+00 | 8192.0 | 141040.178
4903 | 117216 | 15486.6 | 3.244E-05 | 6.294356E+00 | 8192.0 | 92893.758
4904 | 117264 | 15374.1 | 3.245E-05 | 6.459288E+00 | 8192.0 | 105593.680
4905 | 117312 | 15525.3 | 3.247E-05 | 6.321597E+00 | 8192.0 | 92345.299
4906 | 117360 | 15464.1 | 3.248E-05 | 6.394690E+00 | 8192.0 | 115046.817
4907 | 117408 | 15463.2 | 3.249E-05 | 6.382209E+00 | 8192.0 | 129712.277
4908 | 117456 | 15513.8 | 3.251E-05 | 6.406621E+00 | 8192.0 | 97342.857
4909 | 117504 | 15695.2 | 3.252E-05 | 6.313143E+00 | 8192.0 | 113026.841
4910 | 117552 | 15443.0 | 3.253E-05 | 6.450486E+00 | 8192.0 | 95063.553
4911 | 117600 | 15416.6 | 3.255E-05 | 6.485876E+00 | 8192.0 | 102064.281
4912 | 117648 | 15823.7 | 3.256E-05 | 6.276315E+00 | 8192.0 | 114959.499
4913 | 117696 | 15625.5 | 3.257E-05 | 6.405933E+00 | 8192.0 | 117232.383
4914 | 117744 | 15455.3 | 3.259E-05 | 6.233083E+00 | 8192.0 | 109853.141
4915 | 117792 | 15594.3 | 3.260E-05 | 6.418136E+00 | 8192.0 | 108180.861
4916 | 117840 | 15954.3 | 3.261E-05 | 6.385183E+00 | 8192.0 | 103614.011
4917 | 117888 | 15458.8 | 3.263E-05 | 6.341071E+00 | 8192.0 | 87833.153
4918 | 117936 | 15501.3 | 3.264E-05 | 6.418250E+00 | 8192.0 | 91681.912
4919 | 117984 | 15446.3 | 3.265E-05 | 6.298886E+00 | 8192.0 | 98048.635
4920 | 118032 | 15905.0 | 3.267E-05 | 6.413123E+00 | 8192.0 | 103541.248
4921 | 118080 | 15416.1 | 3.268E-05 | 6.282074E+00 | 8192.0 | 100452.621
4922 | 118128 | 15499.9 | 3.269E-05 | 6.371088E+00 | 8192.0 | 118401.522
4923 | 118176 | 15522.6 | 3.271E-05 | 6.399379E+00 | 8192.0 | 100877.549
4924 | 118224 | 15859.1 | 3.272E-05 | 6.450886E+00 | 8192.0 | 115997.698
4925 | 118272 | 15622.0 | 3.273E-05 | 6.412412E+00 | 8192.0 | 121229.477
4926 | 118320 | 15522.5 | 3.275E-05 | 6.276751E+00 | 8192.0 | 127323.029
4927 | 118368 | 15489.0 | 3.276E-05 | 6.328137E+00 | 8192.0 | 109231.572
4928 | 118416 | 15679.3 | 3.277E-05 | 6.343997E+00 | 8192.0 | 94463.087
4929 | 118464 | 15506.4 | 3.279E-05 | 6.367960E+00 | 8192.0 | 104644.038
4930 | 118512 | 15552.6 | 3.280E-05 | 6.375040E+00 | 8192.0 | 108080.731
4931 | 118560 | 15566.6 | 3.281E-05 | 6.468022E+00 | 8192.0 | 98813.039
4932 | 118608 | 15633.8 | 3.283E-05 | 6.478949E+00 | 8192.0 | 119522.152
4933 | 118656 | 15451.3 | 3.284E-05 | 6.415487E+00 | 8192.0 | 121029.519
4934 | 118704 | 15537.9 | 3.285E-05 | 6.436414E+00 | 8192.0 | 114108.101
4935 | 118752 | 15442.4 | 3.287E-05 | 6.380546E+00 | 8192.0 | 102153.332
4936 | 118800 | 15674.3 | 3.288E-05 | 6.524636E+00 | 8192.0 | 89702.742
4937 | 118848 | 15501.6 | 3.289E-05 | 6.352899E+00 | 8192.0 | 106241.000
4938 | 118896 | 15494.9 | 3.291E-05 | 6.292633E+00 | 8192.0 | 95129.966
4939 | 118944 | 15936.8 | 3.292E-05 | 6.337314E+00 | 8192.0 | 120723.828
4940 | 118992 | 15531.1 | 3.293E-05 | 6.391080E+00 | 8192.0 | 145548.804
4941 | 119040 | 15466.0 | 3.295E-05 | 6.343481E+00 | 8192.0 | 211104.534
4942 | 119088 | 15505.4 | 3.296E-05 | 6.528688E+00 | 8192.0 | 140909.560
4943 | 119136 | 15830.2 | 3.297E-05 | 6.411016E+00 | 8192.0 | 127370.305
4944 | 119184 | 15400.2 | 3.299E-05 | 6.483131E+00 | 8192.0 | 104651.898
4945 | 119232 | 15491.5 | 3.300E-05 | 6.509373E+00 | 8192.0 | 129067.934
4946 | 119280 | 15557.0 | 3.301E-05 | 6.338033E+00 | 8192.0 | 111737.692
4947 | 119328 | 15880.4 | 3.303E-05 | 6.346412E+00 | 8192.0 | 105173.160
4948 | 119376 | 15470.3 | 3.304E-05 | 6.433241E+00 | 8192.0 | 117253.932
4949 | 119424 | 15464.0 | 3.305E-05 | 6.408391E+00 | 8192.0 | 100408.960
4950 | 119472 | 15498.5 | 3.307E-05 | 6.403716E+00 | 8192.0 | 124240.587
4951 | 119520 | 15815.9 | 3.308E-05 | 6.389519E+00 | 8192.0 | 100463.890
4952 | 119568 | 15557.3 | 3.309E-05 | 6.505785E+00 | 8192.0 | 106487.068
4953 | 119616 | 15479.5 | 3.311E-05 | 6.381755E+00 | 8192.0 | 102228.411
4954 | 119664 | 15481.8 | 3.312E-05 | 6.379836E+00 | 8192.0 | 118394.733
4955 | 119712 | 15784.5 | 3.313E-05 | 6.475849E+00 | 8192.0 | 122087.327
4956 | 119760 | 15436.2 | 3.315E-05 | 6.490977E+00 | 8192.0 | 123577.161
4957 | 119808 | 15420.1 | 3.316E-05 | 6.418243E+00 | 8192.0 | 146260.906
4958 | 119856 | 15433.1 | 3.317E-05 | 6.375823E+00 | 8192.0 | 102943.358
4959 | 119904 | 15549.7 | 3.319E-05 | 6.454865E+00 | 8192.0 | 95733.097
4960 | 119952 | 15477.0 | 3.320E-05 | 6.376845E+00 | 8192.0 | 105409.137
4961 | 120000 | 15553.6 | 3.321E-05 | (record cut off in the source after "global batch size: 48 | lm")
loss: 6.369764E+00 | loss scale: 8192.0 | grad norm: 100426.286 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4962/ 159576 | consumed samples: 120048 | elapsed time per iteration (ms): 15567.9 | learning rate: 3.323E-05 | global batch size: 48 | lm loss: 6.386555E+00 | loss scale: 8192.0 | grad norm: 100112.758 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4963/ 159576 | consumed samples: 120096 | elapsed time per iteration (ms): 15733.5 | learning rate: 3.324E-05 | global batch size: 48 | lm loss: 6.487816E+00 | loss scale: 8192.0 | grad norm: 117343.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4964/ 159576 | consumed samples: 120144 | elapsed time per iteration (ms): 15368.5 | learning rate: 3.325E-05 | global batch size: 48 | lm loss: 6.415962E+00 | loss scale: 8192.0 | grad norm: 98866.878 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4965/ 159576 | consumed samples: 120192 | elapsed time per iteration (ms): 15477.1 | learning rate: 3.327E-05 | global batch size: 48 | lm loss: 6.374081E+00 | loss scale: 8192.0 | grad norm: 124767.543 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4966/ 159576 | consumed samples: 120240 | elapsed time per iteration (ms): 15922.3 | learning rate: 3.328E-05 | global batch size: 48 | lm loss: 6.338925E+00 | loss scale: 8192.0 | grad norm: 229637.846 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4967/ 159576 | consumed samples: 120288 | elapsed time per iteration (ms): 15438.9 | learning rate: 3.329E-05 | global batch size: 48 | lm loss: 6.318257E+00 | loss scale: 8192.0 | grad norm: 138618.442 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4968/ 159576 | consumed samples: 120336 | elapsed time per iteration (ms): 15527.5 | learning rate: 3.331E-05 | global batch size: 48 | lm loss: 6.406815E+00 | loss scale: 8192.0 | grad norm: 101628.651 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4969/ 159576 | consumed samples: 120384 | elapsed time per iteration (ms): 15565.4 | learning rate: 3.332E-05 | global batch size: 48 | lm loss: 6.381866E+00 | loss scale: 8192.0 | grad norm: 138150.093 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4970/ 159576 | consumed samples: 120432 | elapsed time per iteration (ms): 15898.0 | learning rate: 3.333E-05 | global batch size: 48 | lm loss: 6.305198E+00 | loss scale: 8192.0 | grad norm: 94133.912 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4971/ 159576 | consumed samples: 120480 | elapsed time per iteration (ms): 15413.4 | learning rate: 3.335E-05 | global batch size: 48 | lm loss: 6.276737E+00 | loss scale: 8192.0 | grad norm: 89212.813 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4972/ 159576 | consumed samples: 120528 | elapsed time per iteration (ms): 15553.0 | learning rate: 3.336E-05 | global batch size: 48 | lm loss: 6.404760E+00 | loss scale: 8192.0 | grad norm: 119702.116 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) 
- iteration 4973/ 159576 | consumed samples: 120576 | elapsed time per iteration (ms): 15428.6 | learning rate: 3.337E-05 | global batch size: 48 | lm loss: 6.225817E+00 | loss scale: 8192.0 | grad norm: 94382.038 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4974/ 159576 | consumed samples: 120624 | elapsed time per iteration (ms): 15767.2 | learning rate: 3.339E-05 | global batch size: 48 | lm loss: 6.442757E+00 | loss scale: 8192.0 | grad norm: 99692.552 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4975/ 159576 | consumed samples: 120672 | elapsed time per iteration (ms): 15514.4 | learning rate: 3.340E-05 | global batch size: 48 | lm loss: 6.472607E+00 | loss scale: 8192.0 | grad norm: 112543.414 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4976/ 159576 | consumed samples: 120720 | elapsed time per iteration (ms): 15602.8 | learning rate: 3.341E-05 | global batch size: 48 | lm loss: 6.382205E+00 | loss scale: 8192.0 | grad norm: 97309.286 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4977/ 159576 | consumed samples: 120768 | elapsed time per iteration (ms): 15584.4 | learning rate: 3.343E-05 | global batch size: 48 | lm loss: 6.527099E+00 | loss scale: 8192.0 | grad norm: 91482.780 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4978/ 159576 | consumed samples: 120816 | elapsed time per iteration (ms): 15753.9 | learning rate: 3.344E-05 | global batch size: 48 | lm loss: 6.475079E+00 | loss scale: 8192.0 | grad norm: 167594.086 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4979/ 159576 | consumed samples: 120864 | elapsed time per iteration (ms): 15592.8 | learning rate: 3.345E-05 | global batch size: 48 | lm loss: 6.377496E+00 | loss scale: 8192.0 | grad norm: 94710.465 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4980/ 159576 | consumed samples: 120912 | elapsed time per iteration (ms): 15439.6 | learning rate: 3.347E-05 | global batch size: 48 | lm loss: 6.396212E+00 | loss scale: 8192.0 | grad norm: 82226.776 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4981/ 159576 | consumed samples: 120960 | elapsed time per iteration (ms): 15453.4 | learning rate: 3.348E-05 | global batch size: 48 | lm loss: 6.392390E+00 | loss scale: 8192.0 | grad norm: 93532.515 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4982/ 159576 | consumed samples: 121008 | elapsed time per iteration (ms): 15623.6 | learning rate: 3.349E-05 | global batch size: 48 | lm loss: 6.384733E+00 | loss scale: 8192.0 | grad norm: 99819.245 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4983/ 159576 | consumed samples: 121056 | elapsed time per iteration (ms): 15476.4 | learning rate: 3.351E-05 | global batch size: 48 | lm loss: 6.365707E+00 | loss scale: 8192.0 | grad norm: 115195.515 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4984/ 159576 | consumed samples: 121104 | elapsed time per iteration (ms): 15519.9 | learning rate: 3.352E-05 | global batch size: 48 | lm loss: 
6.280232E+00 | loss scale: 8192.0 | grad norm: 88569.976 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4985/ 159576 | consumed samples: 121152 | elapsed time per iteration (ms): 15489.3 | learning rate: 3.353E-05 | global batch size: 48 | lm loss: 6.514761E+00 | loss scale: 8192.0 | grad norm: 110101.646 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4986/ 159576 | consumed samples: 121200 | elapsed time per iteration (ms): 15582.9 | learning rate: 3.355E-05 | global batch size: 48 | lm loss: 6.394022E+00 | loss scale: 8192.0 | grad norm: 104900.137 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4987/ 159576 | consumed samples: 121248 | elapsed time per iteration (ms): 15478.8 | learning rate: 3.356E-05 | global batch size: 48 | lm loss: 6.428993E+00 | loss scale: 8192.0 | grad norm: 99980.054 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4988/ 159576 | consumed samples: 121296 | elapsed time per iteration (ms): 15470.8 | learning rate: 3.357E-05 | global batch size: 48 | lm loss: 6.383337E+00 | loss scale: 8192.0 | grad norm: 96150.673 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4989/ 159576 | consumed samples: 121344 | elapsed time per iteration (ms): 15490.7 | learning rate: 3.359E-05 | global batch size: 48 | lm loss: 6.440140E+00 | loss scale: 8192.0 | grad norm: 99225.792 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4990/ 159576 | consumed samples: 121392 | elapsed time per iteration (ms): 16022.8 | learning rate: 3.360E-05 | global batch size: 48 | lm loss: 6.329103E+00 | loss scale: 8192.0 | grad norm: 77357.711 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4991/ 159576 | consumed samples: 121440 | elapsed time per iteration (ms): 15500.7 | learning rate: 3.361E-05 | global batch size: 48 | lm loss: 6.346808E+00 | loss scale: 8192.0 | grad norm: 83379.862 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4992/ 159576 | consumed samples: 121488 | elapsed time per iteration (ms): 15638.6 | learning rate: 3.363E-05 | global batch size: 48 | lm loss: 6.460890E+00 | loss scale: 8192.0 | grad norm: 114878.567 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4993/ 159576 | consumed samples: 121536 | elapsed time per iteration (ms): 15882.0 | learning rate: 3.364E-05 | global batch size: 48 | lm loss: 6.485402E+00 | loss scale: 8192.0 | grad norm: 164153.089 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4994/ 159576 | consumed samples: 121584 | elapsed time per iteration (ms): 15543.1 | learning rate: 3.365E-05 | global batch size: 48 | lm loss: 6.511444E+00 | loss scale: 8192.0 | grad norm: 102365.809 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4995/ 159576 | consumed samples: 121632 | elapsed time per iteration (ms): 15538.2 | learning rate: 3.367E-05 | global batch size: 48 | lm loss: 6.413379E+00 | loss scale: 8192.0 | grad norm: 115181.224 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - 
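One quick sanity check these records support: consumed samples should advance by exactly the global batch size every step. A minimal check in Python, using the endpoints of the stretch above (all numbers copied from the log):

# Consumed-samples bookkeeping check; values are taken from the records above.
start_iter, start_samples = 4916, 117840
end_iter, end_samples = 4995, 121632
global_batch_size = 48

expected = start_samples + (end_iter - start_iter) * global_batch_size
assert expected == end_samples   # 117840 + 79 * 48 == 121632
print("consumed-samples counter is consistent")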
iteration 4996/ 159576 | consumed samples: 121680 | elapsed time per iteration (ms): 15508.4 | learning rate: 3.368E-05 | global batch size: 48 | lm loss: 6.359092E+00 | loss scale: 8192.0 | grad norm: 117830.201 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4997/ 159576 | consumed samples: 121728 | elapsed time per iteration (ms): 15913.3 | learning rate: 3.369E-05 | global batch size: 48 | lm loss: 6.388143E+00 | loss scale: 8192.0 | grad norm: 90924.050 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4998/ 159576 | consumed samples: 121776 | elapsed time per iteration (ms): 15515.1 | learning rate: 3.371E-05 | global batch size: 48 | lm loss: 6.391698E+00 | loss scale: 8192.0 | grad norm: 103478.291 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 4999/ 159576 | consumed samples: 121824 | elapsed time per iteration (ms): 15550.7 | learning rate: 3.372E-05 | global batch size: 48 | lm loss: 6.279563E+00 | loss scale: 8192.0 | grad norm: 129145.172 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5000/ 159576 | consumed samples: 121872 | elapsed time per iteration (ms): 15649.4 | learning rate: 3.373E-05 | global batch size: 48 | lm loss: 6.273400E+00 | loss scale: 8192.0 | grad norm: 105673.602 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) ------------------------------------------------------------------------------------------------- - validation loss at iteration 5000 | lm loss value: 6.285937E+00 | lm loss PPL: 5.369671E+02 | ------------------------------------------------------------------------------------------------- - iteration 5001/ 159576 | consumed samples: 121920 | elapsed time per iteration (ms): 23355.2 | learning rate: 3.375E-05 | global batch size: 48 | lm loss: 6.432467E+00 | loss scale: 8192.0 | grad norm: 94517.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5002/ 159576 | consumed samples: 121968 | elapsed time per iteration (ms): 15480.8 | learning rate: 3.376E-05 | global batch size: 48 | lm loss: 6.406679E+00 | loss scale: 8192.0 | grad norm: 93989.506 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5003/ 159576 | consumed samples: 122016 | elapsed time per iteration (ms): 15462.8 | learning rate: 3.377E-05 | global batch size: 48 | lm loss: 6.425644E+00 | loss scale: 8192.0 | grad norm: 89681.033 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5004/ 159576 | consumed samples: 122064 | elapsed time per iteration (ms): 15981.7 | learning rate: 3.379E-05 | global batch size: 48 | lm loss: 6.492604E+00 | loss scale: 8192.0 | grad norm: 95165.571 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5005/ 159576 | consumed samples: 122112 | elapsed time per iteration (ms): 15437.2 | learning rate: 3.380E-05 | global batch size: 48 | lm loss: 6.335800E+00 | loss scale: 8192.0 | grad norm: 84441.007 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5006/ 159576 | consumed samples: 122160 | elapsed time per iteration (ms): 15473.9 | learning rate: 3.381E-05 | global batch size: 48 | lm loss: 6.304031E+00 | loss 
[iterations 5007-5020: learning rate 3.383E-05 -> 3.400E-05, lm loss 6.322345-6.448061, loss scale 8192.0, grad norm mostly 82k-123k with spikes at 5012 (162057.226) and 5016 (167288.542)]
[2021-09-24 22:07:41] PULSE: tr8-104B is waiting for the previous job to finish before scheduling a new one using the dependency mechanism (1165978_[1-10%1] on 'gpu_p13' partition)
[2021-09-24 22:07:41] PULSE: tr8-104B is running for 16:15:30 since 2021-09-24T05:52:11 (1162855_1 on 'gpu_p13' partition (r6i4n[5,7],r6i5n[2,7-8],r6i6n[0,2,6],r7i2n[4-5],r7i6n[2-4],r7i7n[7-8],r8i0n[2-3,5-8],r8i1n[0,2-4],r8i2n8,r8i3n[0-2],r8i5n[3-4],r8i7n[3-8],r9i0n[0-2],r9i1n[0-3],r9i2n[3-5,8],r9i3n[0-1,7-8],r9i4n[0-2],r9i5n[3-8],r9i6n[0,7-8]))
[iterations 5021-5027: learning rate 3.401E-05 -> 3.409E-05, lm loss 6.230031-6.505131, loss scale 8192.0, grad norm 82k-103k with one outlier at 5026 (334700.898)]
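For eyeballing trends, or catching grad-norm spikes like the one at iteration 5026, it helps to pull these records into structured form. A minimal parser sketch follows; the field labels are taken from the records themselves, while the regex details and the parse helper are illustrative assumptions rather than anything shipped with the training code:

import re

# Illustrative parser for the per-iteration records shown above. The field
# labels come from the log; the exact spacing pattern is an assumption and
# may need loosening for other Megatron versions.
RECORD = re.compile(
    r"iteration\s+(?P<iter>\d+)/\s*\d+ \|"
    r" consumed samples:\s+(?P<samples>\d+) \|"
    r" elapsed time per iteration \(ms\):\s+(?P<ms>[\d.]+) \|"
    r" learning rate:\s+(?P<lr>[\dE.+-]+) \|"
    r" global batch size:\s+(?P<gbs>\d+) \|"
    r" lm loss:\s+(?P<loss>[\dE.+-]+) \|"
    r" loss scale:\s+(?P<scale>[\d.]+) \|"
    r" grad norm:\s+(?P<gnorm>[\d.]+)"
)

def parse(line: str):
    """Return the record's metrics as floats, or None for non-record lines."""
    m = RECORD.search(line)
    return {k: float(v) for k, v in m.groupdict().items()} if m else None

Mapping parse over the log's lines and plotting lm loss and grad norm against iteration is usually enough to spot every anomaly called out in this section.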
[iterations 5028-5096: consumed samples 123216 -> 126480, learning rate 3.411E-05 -> 3.501E-05, lm loss 6.269705-6.554045, loss scale 8192.0, elapsed time per iteration ~15.4-16.1 s; grad norm mostly 86k-200k, with outliers at 5077 (549435.371), 5091 (233688.200) and 5093 (200287.595); skipped/nan counters stay at 0]
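The grad norm column is a single global L2 norm taken over all parameters, which is why one pathological batch can move it by 5x (as at iteration 5077) while lm loss barely reacts. A sketch of that computation under this assumption, with NumPy arrays standing in for the framework's gradient tensors:

import numpy as np

def global_grad_norm(grads):
    """Global L2 norm: sqrt of the sum of squared elements across all
    parameter gradients, accumulated in float64 for stability."""
    total = sum(float(np.sum(g.astype(np.float64) ** 2)) for g in grads)
    return total ** 0.5

# Toy example: one spiky gradient dominates the global norm.
grads = [np.ones(1000), 500.0 * np.ones(100)]
print(global_grad_norm(grads))  # ~5000.1, driven almost entirely by the spike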
[iterations 5097-5104: learning rate 3.502E-05 -> 3.512E-05, lm loss 6.328413-6.511029, loss scale 8192.0, grad norm 117k-191k]
 iteration     5105/  159576 | consumed samples: 126912 | elapsed time per iteration (ms): 8977.0 | learning rate: 3.512E-05 | global batch size: 48 | lm loss: 6.508333E+00 | loss scale: 4096.0 | grad norm: 190613.302 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
[the short 8977.0 ms step, the learning rate held at 3.512E-05 and the grad norm repeated from the previous step, together with the loss scale dropping from 8192.0 to 4096.0, are consistent with an fp16 overflow being caught and the update being dropped]
[iterations 5106-5119: learning rate 3.513E-05 -> 3.530E-05, lm loss 6.345719-6.453850, loss scale 4096.0, grad norm settles to 59k-103k]
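The scale change at iteration 5105 is the standard dynamic-loss-scaling reaction: when an fp16 overflow is detected the update is dropped and the scale is halved, and after a long enough run of clean steps it is doubled again. A minimal sketch of that policy; the growth factor of 2 and the 1000-step window are common defaults assumed here, not values read from this log:

class DynamicLossScaler:
    """Minimal sketch of overflow-driven loss scaling (assumed defaults)."""
    def __init__(self, scale=8192.0, factor=2.0, window=1000):
        self.scale, self.factor, self.window = scale, factor, window
        self.good_steps = 0

    def update(self, overflow: bool):
        if overflow:                    # drop the step and halve the scale
            self.scale /= self.factor
            self.good_steps = 0
        else:                           # double after `window` clean steps
            self.good_steps += 1
            if self.good_steps % self.window == 0:
                self.scale *= self.factor

scaler = DynamicLossScaler()
scaler.update(overflow=True)
print(scaler.scale)  # 4096.0, as at iteration 5105 above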
[iterations 5120-5153: consumed samples 127632 -> 129216, learning rate 3.532E-05 -> 3.576E-05, lm loss 6.265258-6.528796, loss scale 4096.0, grad norm 50444.513-136366.609, skipped/nan counters 0]
 iteration     5154/  159576 | consumed samples: 129264 | elapsed time per iteration (ms): 15505.2 | learning rate: 3.577E-05 | global batch size: 48 | lm loss:
6.320354E+00 | loss scale: 4096.0 | grad norm: 101458.750 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5155/ 159576 | consumed samples: 129312 | elapsed time per iteration (ms): 15994.4 | learning rate: 3.578E-05 | global batch size: 48 | lm loss: 6.453386E+00 | loss scale: 4096.0 | grad norm: 118215.611 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5156/ 159576 | consumed samples: 129360 | elapsed time per iteration (ms): 15565.8 | learning rate: 3.580E-05 | global batch size: 48 | lm loss: 6.443649E+00 | loss scale: 4096.0 | grad norm: 72691.423 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5157/ 159576 | consumed samples: 129408 | elapsed time per iteration (ms): 15539.2 | learning rate: 3.581E-05 | global batch size: 48 | lm loss: 6.528984E+00 | loss scale: 4096.0 | grad norm: 72165.791 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5158/ 159576 | consumed samples: 129456 | elapsed time per iteration (ms): 15536.3 | learning rate: 3.582E-05 | global batch size: 48 | lm loss: 6.398818E+00 | loss scale: 4096.0 | grad norm: 69046.921 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5159/ 159576 | consumed samples: 129504 | elapsed time per iteration (ms): 15739.5 | learning rate: 3.584E-05 | global batch size: 48 | lm loss: 6.384636E+00 | loss scale: 4096.0 | grad norm: 65721.319 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5160/ 159576 | consumed samples: 129552 | elapsed time per iteration (ms): 15530.3 | learning rate: 3.585E-05 | global batch size: 48 | lm loss: 6.340583E+00 | loss scale: 4096.0 | grad norm: 70984.261 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5161/ 159576 | consumed samples: 129600 | elapsed time per iteration (ms): 15537.1 | learning rate: 3.586E-05 | global batch size: 48 | lm loss: 6.299366E+00 | loss scale: 4096.0 | grad norm: 120531.429 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5162/ 159576 | consumed samples: 129648 | elapsed time per iteration (ms): 15525.1 | learning rate: 3.588E-05 | global batch size: 48 | lm loss: 6.422726E+00 | loss scale: 4096.0 | grad norm: 80943.603 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5163/ 159576 | consumed samples: 129696 | elapsed time per iteration (ms): 15737.7 | learning rate: 3.589E-05 | global batch size: 48 | lm loss: 6.343781E+00 | loss scale: 4096.0 | grad norm: 62800.221 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5164/ 159576 | consumed samples: 129744 | elapsed time per iteration (ms): 15570.2 | learning rate: 3.590E-05 | global batch size: 48 | lm loss: 6.478961E+00 | loss scale: 4096.0 | grad norm: 49279.442 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5165/ 159576 | consumed samples: 129792 | elapsed time per iteration (ms): 15467.9 | learning rate: 3.592E-05 | global batch size: 48 | lm loss: 6.465704E+00 | loss scale: 4096.0 | grad norm: 56608.697 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 
5166/ 159576 | consumed samples: 129840 | elapsed time per iteration (ms): 15511.0 | learning rate: 3.593E-05 | global batch size: 48 | lm loss: 6.389446E+00 | loss scale: 4096.0 | grad norm: 64287.210 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5167/ 159576 | consumed samples: 129888 | elapsed time per iteration (ms): 15650.0 | learning rate: 3.594E-05 | global batch size: 48 | lm loss: 6.432152E+00 | loss scale: 4096.0 | grad norm: 68389.100 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5168/ 159576 | consumed samples: 129936 | elapsed time per iteration (ms): 15501.5 | learning rate: 3.596E-05 | global batch size: 48 | lm loss: 6.311705E+00 | loss scale: 4096.0 | grad norm: 60127.301 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5169/ 159576 | consumed samples: 129984 | elapsed time per iteration (ms): 15500.0 | learning rate: 3.597E-05 | global batch size: 48 | lm loss: 6.459386E+00 | loss scale: 4096.0 | grad norm: 193850.992 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5170/ 159576 | consumed samples: 130032 | elapsed time per iteration (ms): 15853.5 | learning rate: 3.598E-05 | global batch size: 48 | lm loss: 6.359794E+00 | loss scale: 4096.0 | grad norm: 201400.324 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5171/ 159576 | consumed samples: 130080 | elapsed time per iteration (ms): 15565.6 | learning rate: 3.600E-05 | global batch size: 48 | lm loss: 6.447841E+00 | loss scale: 4096.0 | grad norm: 60758.011 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5172/ 159576 | consumed samples: 130128 | elapsed time per iteration (ms): 15439.0 | learning rate: 3.601E-05 | global batch size: 48 | lm loss: 6.390144E+00 | loss scale: 4096.0 | grad norm: 60173.953 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5173/ 159576 | consumed samples: 130176 | elapsed time per iteration (ms): 15512.4 | learning rate: 3.602E-05 | global batch size: 48 | lm loss: 6.471553E+00 | loss scale: 4096.0 | grad norm: 65209.828 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5174/ 159576 | consumed samples: 130224 | elapsed time per iteration (ms): 15753.1 | learning rate: 3.604E-05 | global batch size: 48 | lm loss: 6.363354E+00 | loss scale: 4096.0 | grad norm: 66471.065 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5175/ 159576 | consumed samples: 130272 | elapsed time per iteration (ms): 15415.5 | learning rate: 3.605E-05 | global batch size: 48 | lm loss: 6.418964E+00 | loss scale: 4096.0 | grad norm: 63654.751 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5176/ 159576 | consumed samples: 130320 | elapsed time per iteration (ms): 15469.1 | learning rate: 3.606E-05 | global batch size: 48 | lm loss: 6.357801E+00 | loss scale: 4096.0 | grad norm: 82288.957 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5177/ 159576 | consumed samples: 130368 | elapsed time per iteration (ms): 15407.1 | learning rate: 3.608E-05 | global batch size: 48 | lm loss: 6.479723E+00 | loss 
scale: 4096.0 | grad norm: 63508.625 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5178/ 159576 | consumed samples: 130416 | elapsed time per iteration (ms): 15785.1 | learning rate: 3.609E-05 | global batch size: 48 | lm loss: 6.532706E+00 | loss scale: 4096.0 | grad norm: 62734.072 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5179/ 159576 | consumed samples: 130464 | elapsed time per iteration (ms): 15467.8 | learning rate: 3.610E-05 | global batch size: 48 | lm loss: 6.442670E+00 | loss scale: 4096.0 | grad norm: 64963.382 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5180/ 159576 | consumed samples: 130512 | elapsed time per iteration (ms): 15479.5 | learning rate: 3.612E-05 | global batch size: 48 | lm loss: 6.373410E+00 | loss scale: 4096.0 | grad norm: 62492.194 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5181/ 159576 | consumed samples: 130560 | elapsed time per iteration (ms): 15413.5 | learning rate: 3.613E-05 | global batch size: 48 | lm loss: 6.442731E+00 | loss scale: 4096.0 | grad norm: 93654.611 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5182/ 159576 | consumed samples: 130608 | elapsed time per iteration (ms): 15788.0 | learning rate: 3.614E-05 | global batch size: 48 | lm loss: 6.356236E+00 | loss scale: 4096.0 | grad norm: 77133.068 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5183/ 159576 | consumed samples: 130656 | elapsed time per iteration (ms): 15436.5 | learning rate: 3.616E-05 | global batch size: 48 | lm loss: 6.321268E+00 | loss scale: 4096.0 | grad norm: 138010.507 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5184/ 159576 | consumed samples: 130704 | elapsed time per iteration (ms): 15417.0 | learning rate: 3.617E-05 | global batch size: 48 | lm loss: 6.463357E+00 | loss scale: 4096.0 | grad norm: 67977.572 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5185/ 159576 | consumed samples: 130752 | elapsed time per iteration (ms): 15399.1 | learning rate: 3.618E-05 | global batch size: 48 | lm loss: 6.369720E+00 | loss scale: 4096.0 | grad norm: 73939.997 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5186/ 159576 | consumed samples: 130800 | elapsed time per iteration (ms): 15682.4 | learning rate: 3.620E-05 | global batch size: 48 | lm loss: 6.404753E+00 | loss scale: 4096.0 | grad norm: 71441.970 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5187/ 159576 | consumed samples: 130848 | elapsed time per iteration (ms): 15500.0 | learning rate: 3.621E-05 | global batch size: 48 | lm loss: 6.418368E+00 | loss scale: 4096.0 | grad norm: 85130.256 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5188/ 159576 | consumed samples: 130896 | elapsed time per iteration (ms): 15437.0 | learning rate: 3.622E-05 | global batch size: 48 | lm loss: 6.391647E+00 | loss scale: 4096.0 | grad norm: 66283.229 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5189/ 159576 | 
consumed samples: 130944 | elapsed time per iteration (ms): 15475.7 | learning rate: 3.624E-05 | global batch size: 48 | lm loss: 6.322616E+00 | loss scale: 4096.0 | grad norm: 75047.649 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5190/ 159576 | consumed samples: 130992 | elapsed time per iteration (ms): 15579.8 | learning rate: 3.625E-05 | global batch size: 48 | lm loss: 6.431418E+00 | loss scale: 4096.0 | grad norm: 58908.817 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5191/ 159576 | consumed samples: 131040 | elapsed time per iteration (ms): 15429.7 | learning rate: 3.626E-05 | global batch size: 48 | lm loss: 6.535919E+00 | loss scale: 4096.0 | grad norm: 122859.857 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5192/ 159576 | consumed samples: 131088 | elapsed time per iteration (ms): 15437.2 | learning rate: 3.628E-05 | global batch size: 48 | lm loss: 6.220134E+00 | loss scale: 4096.0 | grad norm: 92437.561 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5193/ 159576 | consumed samples: 131136 | elapsed time per iteration (ms): 15429.8 | learning rate: 3.629E-05 | global batch size: 48 | lm loss: 6.373948E+00 | loss scale: 4096.0 | grad norm: 93116.737 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5194/ 159576 | consumed samples: 131184 | elapsed time per iteration (ms): 15588.8 | learning rate: 3.630E-05 | global batch size: 48 | lm loss: 6.390661E+00 | loss scale: 4096.0 | grad norm: 64520.956 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5195/ 159576 | consumed samples: 131232 | elapsed time per iteration (ms): 15414.6 | learning rate: 3.632E-05 | global batch size: 48 | lm loss: 6.359470E+00 | loss scale: 4096.0 | grad norm: 61039.424 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5196/ 159576 | consumed samples: 131280 | elapsed time per iteration (ms): 15469.0 | learning rate: 3.633E-05 | global batch size: 48 | lm loss: 6.426967E+00 | loss scale: 4096.0 | grad norm: 69860.175 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5197/ 159576 | consumed samples: 131328 | elapsed time per iteration (ms): 15399.3 | learning rate: 3.634E-05 | global batch size: 48 | lm loss: 6.397369E+00 | loss scale: 4096.0 | grad norm: 67025.925 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5198/ 159576 | consumed samples: 131376 | elapsed time per iteration (ms): 15852.9 | learning rate: 3.636E-05 | global batch size: 48 | lm loss: 6.470811E+00 | loss scale: 4096.0 | grad norm: 94172.614 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5199/ 159576 | consumed samples: 131424 | elapsed time per iteration (ms): 15428.8 | learning rate: 3.637E-05 | global batch size: 48 | lm loss: 6.341267E+00 | loss scale: 4096.0 | grad norm: 73918.814 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5200/ 159576 | consumed samples: 131472 | elapsed time per iteration (ms): 15444.1 | learning rate: 3.638E-05 | global batch size: 48 | lm loss: 6.434019E+00 | loss scale: 4096.0 | 
grad norm: 107373.139 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5201/ 159576 | consumed samples: 131520 | elapsed time per iteration (ms): 15807.8 | learning rate: 3.639E-05 | global batch size: 48 | lm loss: 6.288959E+00 | loss scale: 4096.0 | grad norm: 60538.434 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5202/ 159576 | consumed samples: 131568 | elapsed time per iteration (ms): 15428.1 | learning rate: 3.641E-05 | global batch size: 48 | lm loss: 6.382991E+00 | loss scale: 4096.0 | grad norm: 87744.726 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5203/ 159576 | consumed samples: 131616 | elapsed time per iteration (ms): 15473.7 | learning rate: 3.642E-05 | global batch size: 48 | lm loss: 6.421006E+00 | loss scale: 4096.0 | grad norm: 63743.211 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5204/ 159576 | consumed samples: 131664 | elapsed time per iteration (ms): 15342.5 | learning rate: 3.643E-05 | global batch size: 48 | lm loss: 6.345580E+00 | loss scale: 4096.0 | grad norm: 83317.459 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5205/ 159576 | consumed samples: 131712 | elapsed time per iteration (ms): 15751.6 | learning rate: 3.645E-05 | global batch size: 48 | lm loss: 6.379266E+00 | loss scale: 4096.0 | grad norm: 72285.964 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5206/ 159576 | consumed samples: 131760 | elapsed time per iteration (ms): 15391.2 | learning rate: 3.646E-05 | global batch size: 48 | lm loss: 6.296494E+00 | loss scale: 4096.0 | grad norm: 99774.130 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5207/ 159576 | consumed samples: 131808 | elapsed time per iteration (ms): 15463.8 | learning rate: 3.647E-05 | global batch size: 48 | lm loss: 6.419320E+00 | loss scale: 4096.0 | grad norm: 76787.605 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5208/ 159576 | consumed samples: 131856 | elapsed time per iteration (ms): 15457.9 | learning rate: 3.649E-05 | global batch size: 48 | lm loss: 6.321754E+00 | loss scale: 4096.0 | grad norm: 71044.606 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5209/ 159576 | consumed samples: 131904 | elapsed time per iteration (ms): 15812.3 | learning rate: 3.650E-05 | global batch size: 48 | lm loss: 6.295812E+00 | loss scale: 4096.0 | grad norm: 80278.535 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5210/ 159576 | consumed samples: 131952 | elapsed time per iteration (ms): 15416.3 | learning rate: 3.651E-05 | global batch size: 48 | lm loss: 6.444015E+00 | loss scale: 4096.0 | grad norm: 69086.077 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5211/ 159576 | consumed samples: 132000 | elapsed time per iteration (ms): 15496.5 | learning rate: 3.653E-05 | global batch size: 48 | lm loss: 6.426943E+00 | loss scale: 4096.0 | grad norm: 87922.534 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5212/ 159576 | consumed samples: 
132048 | elapsed time per iteration (ms): 15327.0 | learning rate: 3.654E-05 | global batch size: 48 | lm loss: 6.361041E+00 | loss scale: 4096.0 | grad norm: 68686.112 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5213/ 159576 | consumed samples: 132096 | elapsed time per iteration (ms): 15936.5 | learning rate: 3.655E-05 | global batch size: 48 | lm loss: 6.389860E+00 | loss scale: 4096.0 | grad norm: 68529.242 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5214/ 159576 | consumed samples: 132144 | elapsed time per iteration (ms): 15542.2 | learning rate: 3.657E-05 | global batch size: 48 | lm loss: 6.395509E+00 | loss scale: 4096.0 | grad norm: 66332.216 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5215/ 159576 | consumed samples: 132192 | elapsed time per iteration (ms): 15481.3 | learning rate: 3.658E-05 | global batch size: 48 | lm loss: 6.378184E+00 | loss scale: 4096.0 | grad norm: 69005.077 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5216/ 159576 | consumed samples: 132240 | elapsed time per iteration (ms): 15471.0 | learning rate: 3.659E-05 | global batch size: 48 | lm loss: 6.409903E+00 | loss scale: 4096.0 | grad norm: 78238.545 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5217/ 159576 | consumed samples: 132288 | elapsed time per iteration (ms): 15765.5 | learning rate: 3.661E-05 | global batch size: 48 | lm loss: 6.468248E+00 | loss scale: 4096.0 | grad norm: 81260.175 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5218/ 159576 | consumed samples: 132336 | elapsed time per iteration (ms): 15514.7 | learning rate: 3.662E-05 | global batch size: 48 | lm loss: 6.462075E+00 | loss scale: 4096.0 | grad norm: 89591.763 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5219/ 159576 | consumed samples: 132384 | elapsed time per iteration (ms): 15488.0 | learning rate: 3.663E-05 | global batch size: 48 | lm loss: 6.402821E+00 | loss scale: 4096.0 | grad norm: 67243.019 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5220/ 159576 | consumed samples: 132432 | elapsed time per iteration (ms): 15443.2 | learning rate: 3.665E-05 | global batch size: 48 | lm loss: 6.377299E+00 | loss scale: 4096.0 | grad norm: 73909.640 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5221/ 159576 | consumed samples: 132480 | elapsed time per iteration (ms): 15695.0 | learning rate: 3.666E-05 | global batch size: 48 | lm loss: 6.451472E+00 | loss scale: 4096.0 | grad norm: 66658.049 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5222/ 159576 | consumed samples: 132528 | elapsed time per iteration (ms): 15480.5 | learning rate: 3.667E-05 | global batch size: 48 | lm loss: 6.465474E+00 | loss scale: 4096.0 | grad norm: 71303.345 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5223/ 159576 | consumed samples: 132576 | elapsed time per iteration (ms): 15538.4 | learning rate: 3.669E-05 | global batch size: 48 | lm loss: 6.452018E+00 | loss scale: 4096.0 | grad norm: 
61632.620 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5224/ 159576 | consumed samples: 132624 | elapsed time per iteration (ms): 15433.6 | learning rate: 3.670E-05 | global batch size: 48 | lm loss: 6.417565E+00 | loss scale: 4096.0 | grad norm: 99052.706 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5225/ 159576 | consumed samples: 132672 | elapsed time per iteration (ms): 16019.0 | learning rate: 3.671E-05 | global batch size: 48 | lm loss: 6.392467E+00 | loss scale: 4096.0 | grad norm: 81901.168 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5226/ 159576 | consumed samples: 132720 | elapsed time per iteration (ms): 15479.0 | learning rate: 3.673E-05 | global batch size: 48 | lm loss: 6.432102E+00 | loss scale: 4096.0 | grad norm: 80603.914 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5227/ 159576 | consumed samples: 132768 | elapsed time per iteration (ms): 15499.4 | learning rate: 3.674E-05 | global batch size: 48 | lm loss: 6.304895E+00 | loss scale: 4096.0 | grad norm: 63916.075 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5228/ 159576 | consumed samples: 132816 | elapsed time per iteration (ms): 15774.2 | learning rate: 3.675E-05 | global batch size: 48 | lm loss: 6.323613E+00 | loss scale: 4096.0 | grad norm: 76694.249 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5229/ 159576 | consumed samples: 132864 | elapsed time per iteration (ms): 15599.1 | learning rate: 3.677E-05 | global batch size: 48 | lm loss: 6.488564E+00 | loss scale: 4096.0 | grad norm: 76280.931 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5230/ 159576 | consumed samples: 132912 | elapsed time per iteration (ms): 15549.2 | learning rate: 3.678E-05 | global batch size: 48 | lm loss: 6.430355E+00 | loss scale: 4096.0 | grad norm: 71462.889 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5231/ 159576 | consumed samples: 132960 | elapsed time per iteration (ms): 15501.3 | learning rate: 3.679E-05 | global batch size: 48 | lm loss: 6.493622E+00 | loss scale: 4096.0 | grad norm: 59853.872 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5232/ 159576 | consumed samples: 133008 | elapsed time per iteration (ms): 15779.3 | learning rate: 3.681E-05 | global batch size: 48 | lm loss: 6.284019E+00 | loss scale: 4096.0 | grad norm: 69496.678 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5233/ 159576 | consumed samples: 133056 | elapsed time per iteration (ms): 15428.5 | learning rate: 3.682E-05 | global batch size: 48 | lm loss: 6.267179E+00 | loss scale: 4096.0 | grad norm: 63245.018 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5234/ 159576 | consumed samples: 133104 | elapsed time per iteration (ms): 15461.3 | learning rate: 3.683E-05 | global batch size: 48 | lm loss: 6.449612E+00 | loss scale: 4096.0 | grad norm: 78199.189 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5235/ 159576 | consumed samples: 133152 | elapsed 
time per iteration (ms): 15485.3 | learning rate: 3.685E-05 | global batch size: 48 | lm loss: 6.443536E+00 | loss scale: 4096.0 | grad norm: 70168.271 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5236/ 159576 | consumed samples: 133200 | elapsed time per iteration (ms): 15933.7 | learning rate: 3.686E-05 | global batch size: 48 | lm loss: 6.244983E+00 | loss scale: 4096.0 | grad norm: 75166.513 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5237/ 159576 | consumed samples: 133248 | elapsed time per iteration (ms): 15418.0 | learning rate: 3.687E-05 | global batch size: 48 | lm loss: 6.283341E+00 | loss scale: 4096.0 | grad norm: 72463.714 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5238/ 159576 | consumed samples: 133296 | elapsed time per iteration (ms): 15549.2 | learning rate: 3.689E-05 | global batch size: 48 | lm loss: 6.438685E+00 | loss scale: 4096.0 | grad norm: 82352.679 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5239/ 159576 | consumed samples: 133344 | elapsed time per iteration (ms): 15537.2 | learning rate: 3.690E-05 | global batch size: 48 | lm loss: 6.362652E+00 | loss scale: 4096.0 | grad norm: 70918.803 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5240/ 159576 | consumed samples: 133392 | elapsed time per iteration (ms): 15840.0 | learning rate: 3.691E-05 | global batch size: 48 | lm loss: 6.368175E+00 | loss scale: 4096.0 | grad norm: 155104.639 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5241/ 159576 | consumed samples: 133440 | elapsed time per iteration (ms): 15490.2 | learning rate: 3.693E-05 | global batch size: 48 | lm loss: 6.400668E+00 | loss scale: 4096.0 | grad norm: 68076.314 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5242/ 159576 | consumed samples: 133488 | elapsed time per iteration (ms): 15382.4 | learning rate: 3.694E-05 | global batch size: 48 | lm loss: 6.316941E+00 | loss scale: 4096.0 | grad norm: 57901.587 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5243/ 159576 | consumed samples: 133536 | elapsed time per iteration (ms): 15382.2 | learning rate: 3.695E-05 | global batch size: 48 | lm loss: 6.494829E+00 | loss scale: 4096.0 | grad norm: 62287.898 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5244/ 159576 | consumed samples: 133584 | elapsed time per iteration (ms): 15661.6 | learning rate: 3.697E-05 | global batch size: 48 | lm loss: 6.397869E+00 | loss scale: 4096.0 | grad norm: 57367.212 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5245/ 159576 | consumed samples: 133632 | elapsed time per iteration (ms): 15495.8 | learning rate: 3.698E-05 | global batch size: 48 | lm loss: 6.256347E+00 | loss scale: 4096.0 | grad norm: 61800.740 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5246/ 159576 | consumed samples: 133680 | elapsed time per iteration (ms): 15523.0 | learning rate: 3.699E-05 | global batch size: 48 | lm loss: 6.389894E+00 | loss scale: 4096.0 | grad norm: 69126.659 | num zeros: 
0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5247/ 159576 | consumed samples: 133728 | elapsed time per iteration (ms): 15546.9 | learning rate: 3.701E-05 | global batch size: 48 | lm loss: 6.346736E+00 | loss scale: 4096.0 | grad norm: 67046.324 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5248/ 159576 | consumed samples: 133776 | elapsed time per iteration (ms): 15650.8 | learning rate: 3.702E-05 | global batch size: 48 | lm loss: 6.430111E+00 | loss scale: 4096.0 | grad norm: 69265.741 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5249/ 159576 | consumed samples: 133824 | elapsed time per iteration (ms): 15490.3 | learning rate: 3.703E-05 | global batch size: 48 | lm loss: 6.390760E+00 | loss scale: 4096.0 | grad norm: 142507.174 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5250/ 159576 | consumed samples: 133872 | elapsed time per iteration (ms): 15521.8 | learning rate: 3.705E-05 | global batch size: 48 | lm loss: 6.420756E+00 | loss scale: 4096.0 | grad norm: 64815.993 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5251/ 159576 | consumed samples: 133920 | elapsed time per iteration (ms): 15759.8 | learning rate: 3.706E-05 | global batch size: 48 | lm loss: 6.360211E+00 | loss scale: 4096.0 | grad norm: 82700.295 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5252/ 159576 | consumed samples: 133968 | elapsed time per iteration (ms): 15623.8 | learning rate: 3.707E-05 | global batch size: 48 | lm loss: 6.351006E+00 | loss scale: 4096.0 | grad norm: 70030.998 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -[2021-09-24 23:07:37] PULSE: tr8-104B is waiting for the previous job to finish before scheduling a new one using the dependency mechanism (1165978_[1-10%1] on 'gpu_p13' partition) -[2021-09-24 23:07:37] PULSE: tr8-104B is running for 17:15:26 since 2021-09-24T05:52:11 (1162855_1 on 'gpu_p13' partition (r6i4n[5,7],r6i5n[2,7-8],r6i6n[0,2,6],r7i2n[4-5],r7i6n[2-4],r7i7n[7-8],r8i0n[2-3,5-8],r8i1n[0,2-4],r8i2n8,r8i3n[0-2],r8i5n[3-4],r8i7n[3-8],r9i0n[0-2],r9i1n[0-3],r9i2n[3-5,8],r9i3n[0-1,7-8],r9i4n[0-2],r9i5n[3-8],r9i6n[0,7-8]) - iteration 5253/ 159576 | consumed samples: 134016 | elapsed time per iteration (ms): 15553.2 | learning rate: 3.709E-05 | global batch size: 48 | lm loss: 6.395989E+00 | loss scale: 4096.0 | grad norm: 75934.711 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5254/ 159576 | consumed samples: 134064 | elapsed time per iteration (ms): 15521.6 | learning rate: 3.710E-05 | global batch size: 48 | lm loss: 6.388237E+00 | loss scale: 4096.0 | grad norm: 85225.047 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5255/ 159576 | consumed samples: 134112 | elapsed time per iteration (ms): 15886.3 | learning rate: 3.711E-05 | global batch size: 48 | lm loss: 6.348703E+00 | loss scale: 4096.0 | grad norm: 72802.836 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5256/ 159576 | consumed samples: 134160 | elapsed time per iteration (ms): 15520.3 | learning rate: 3.713E-05 | global batch size: 48 | lm loss: 6.321572E+00 
| loss scale: 4096.0 | grad norm: 73245.874 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5257/ 159576 | consumed samples: 134208 | elapsed time per iteration (ms): 15443.7 | learning rate: 3.714E-05 | global batch size: 48 | lm loss: 6.335665E+00 | loss scale: 4096.0 | grad norm: 58798.760 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5258/ 159576 | consumed samples: 134256 | elapsed time per iteration (ms): 15427.0 | learning rate: 3.715E-05 | global batch size: 48 | lm loss: 6.319070E+00 | loss scale: 4096.0 | grad norm: 66591.391 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5259/ 159576 | consumed samples: 134304 | elapsed time per iteration (ms): 15760.6 | learning rate: 3.717E-05 | global batch size: 48 | lm loss: 6.229961E+00 | loss scale: 4096.0 | grad norm: 78411.623 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5260/ 159576 | consumed samples: 134352 | elapsed time per iteration (ms): 15544.0 | learning rate: 3.718E-05 | global batch size: 48 | lm loss: 6.379896E+00 | loss scale: 4096.0 | grad norm: 82294.960 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5261/ 159576 | consumed samples: 134400 | elapsed time per iteration (ms): 15397.8 | learning rate: 3.719E-05 | global batch size: 48 | lm loss: 6.233184E+00 | loss scale: 4096.0 | grad norm: 65525.586 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5262/ 159576 | consumed samples: 134448 | elapsed time per iteration (ms): 15498.3 | learning rate: 3.721E-05 | global batch size: 48 | lm loss: 6.326461E+00 | loss scale: 4096.0 | grad norm: 101232.286 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5263/ 159576 | consumed samples: 134496 | elapsed time per iteration (ms): 15834.8 | learning rate: 3.722E-05 | global batch size: 48 | lm loss: 6.351873E+00 | loss scale: 4096.0 | grad norm: 82652.498 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5264/ 159576 | consumed samples: 134544 | elapsed time per iteration (ms): 15450.4 | learning rate: 3.723E-05 | global batch size: 48 | lm loss: 6.411518E+00 | loss scale: 4096.0 | grad norm: 79704.233 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5265/ 159576 | consumed samples: 134592 | elapsed time per iteration (ms): 15408.5 | learning rate: 3.725E-05 | global batch size: 48 | lm loss: 6.324855E+00 | loss scale: 4096.0 | grad norm: 96783.723 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5266/ 159576 | consumed samples: 134640 | elapsed time per iteration (ms): 15369.4 | learning rate: 3.726E-05 | global batch size: 48 | lm loss: 6.351592E+00 | loss scale: 4096.0 | grad norm: 96231.447 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5267/ 159576 | consumed samples: 134688 | elapsed time per iteration (ms): 15643.8 | learning rate: 3.727E-05 | global batch size: 48 | lm loss: 6.439371E+00 | loss scale: 4096.0 | grad norm: 86165.942 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5268/ 159576 | 
consumed samples: 134736 | elapsed time per iteration (ms): 15428.0 | learning rate: 3.729E-05 | global batch size: 48 | lm loss: 6.282881E+00 | loss scale: 4096.0 | grad norm: 95370.085 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5269/ 159576 | consumed samples: 134784 | elapsed time per iteration (ms): 15422.7 | learning rate: 3.730E-05 | global batch size: 48 | lm loss: 6.489480E+00 | loss scale: 4096.0 | grad norm: 77407.640 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5270/ 159576 | consumed samples: 134832 | elapsed time per iteration (ms): 15384.0 | learning rate: 3.731E-05 | global batch size: 48 | lm loss: 6.382200E+00 | loss scale: 4096.0 | grad norm: 66716.315 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5271/ 159576 | consumed samples: 134880 | elapsed time per iteration (ms): 15581.8 | learning rate: 3.733E-05 | global batch size: 48 | lm loss: 6.409722E+00 | loss scale: 4096.0 | grad norm: 68218.526 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5272/ 159576 | consumed samples: 134928 | elapsed time per iteration (ms): 15395.7 | learning rate: 3.734E-05 | global batch size: 48 | lm loss: 6.493249E+00 | loss scale: 4096.0 | grad norm: 71580.496 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5273/ 159576 | consumed samples: 134976 | elapsed time per iteration (ms): 15402.4 | learning rate: 3.735E-05 | global batch size: 48 | lm loss: 6.376624E+00 | loss scale: 4096.0 | grad norm: 85075.910 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5274/ 159576 | consumed samples: 135024 | elapsed time per iteration (ms): 15424.2 | learning rate: 3.737E-05 | global batch size: 48 | lm loss: 6.441435E+00 | loss scale: 4096.0 | grad norm: 75286.225 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5275/ 159576 | consumed samples: 135072 | elapsed time per iteration (ms): 15616.5 | learning rate: 3.738E-05 | global batch size: 48 | lm loss: 6.428281E+00 | loss scale: 4096.0 | grad norm: 71317.497 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5276/ 159576 | consumed samples: 135120 | elapsed time per iteration (ms): 15383.8 | learning rate: 3.739E-05 | global batch size: 48 | lm loss: 6.324539E+00 | loss scale: 4096.0 | grad norm: 70509.208 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5277/ 159576 | consumed samples: 135168 | elapsed time per iteration (ms): 15404.4 | learning rate: 3.741E-05 | global batch size: 48 | lm loss: 6.396560E+00 | loss scale: 4096.0 | grad norm: 68223.773 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5278/ 159576 | consumed samples: 135216 | elapsed time per iteration (ms): 15464.0 | learning rate: 3.742E-05 | global batch size: 48 | lm loss: 6.403405E+00 | loss scale: 4096.0 | grad norm: 74828.040 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5279/ 159576 | consumed samples: 135264 | elapsed time per iteration (ms): 15572.0 | learning rate: 3.743E-05 | global batch size: 48 | lm loss: 6.340907E+00 | loss scale: 4096.0 | 
grad norm: 103719.466 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5280/ 159576 | consumed samples: 135312 | elapsed time per iteration (ms): 15390.1 | learning rate: 3.745E-05 | global batch size: 48 | lm loss: 6.465801E+00 | loss scale: 4096.0 | grad norm: 71954.053 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5281/ 159576 | consumed samples: 135360 | elapsed time per iteration (ms): 15379.3 | learning rate: 3.746E-05 | global batch size: 48 | lm loss: 6.481463E+00 | loss scale: 4096.0 | grad norm: 64156.580 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5282/ 159576 | consumed samples: 135408 | elapsed time per iteration (ms): 15880.0 | learning rate: 3.747E-05 | global batch size: 48 | lm loss: 6.324627E+00 | loss scale: 4096.0 | grad norm: 77974.806 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5283/ 159576 | consumed samples: 135456 | elapsed time per iteration (ms): 15461.2 | learning rate: 3.749E-05 | global batch size: 48 | lm loss: 6.278036E+00 | loss scale: 4096.0 | grad norm: 78417.449 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5284/ 159576 | consumed samples: 135504 | elapsed time per iteration (ms): 15434.3 | learning rate: 3.750E-05 | global batch size: 48 | lm loss: 6.470399E+00 | loss scale: 4096.0 | grad norm: 70677.576 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5285/ 159576 | consumed samples: 135552 | elapsed time per iteration (ms): 15453.3 | learning rate: 3.751E-05 | global batch size: 48 | lm loss: 6.465354E+00 | loss scale: 4096.0 | grad norm: 72699.042 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5286/ 159576 | consumed samples: 135600 | elapsed time per iteration (ms): 15799.4 | learning rate: 3.753E-05 | global batch size: 48 | lm loss: 6.366466E+00 | loss scale: 4096.0 | grad norm: 87890.137 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5287/ 159576 | consumed samples: 135648 | elapsed time per iteration (ms): 15462.6 | learning rate: 3.754E-05 | global batch size: 48 | lm loss: 6.450302E+00 | loss scale: 4096.0 | grad norm: 65500.276 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5288/ 159576 | consumed samples: 135696 | elapsed time per iteration (ms): 15449.3 | learning rate: 3.755E-05 | global batch size: 48 | lm loss: 6.211058E+00 | loss scale: 4096.0 | grad norm: 91309.432 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5289/ 159576 | consumed samples: 135744 | elapsed time per iteration (ms): 15440.0 | learning rate: 3.757E-05 | global batch size: 48 | lm loss: 6.439297E+00 | loss scale: 4096.0 | grad norm: 78139.415 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5290/ 159576 | consumed samples: 135792 | elapsed time per iteration (ms): 15759.6 | learning rate: 3.758E-05 | global batch size: 48 | lm loss: 6.295393E+00 | loss scale: 4096.0 | grad norm: 67343.216 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5291/ 159576 | consumed samples: 
135840 | elapsed time per iteration (ms): 15513.6 | learning rate: 3.759E-05 | global batch size: 48 | lm loss: 6.403075E+00 | loss scale: 4096.0 | grad norm: 88227.795 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5292/ 159576 | consumed samples: 135888 | elapsed time per iteration (ms): 15421.3 | learning rate: 3.761E-05 | global batch size: 48 | lm loss: 6.414333E+00 | loss scale: 4096.0 | grad norm: 78788.254 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5293/ 159576 | consumed samples: 135936 | elapsed time per iteration (ms): 15345.3 | learning rate: 3.762E-05 | global batch size: 48 | lm loss: 6.292488E+00 | loss scale: 4096.0 | grad norm: 59708.880 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5294/ 159576 | consumed samples: 135984 | elapsed time per iteration (ms): 16027.7 | learning rate: 3.763E-05 | global batch size: 48 | lm loss: 6.385753E+00 | loss scale: 4096.0 | grad norm: 102775.204 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5295/ 159576 | consumed samples: 136032 | elapsed time per iteration (ms): 15461.5 | learning rate: 3.765E-05 | global batch size: 48 | lm loss: 6.324437E+00 | loss scale: 4096.0 | grad norm: 71697.534 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5296/ 159576 | consumed samples: 136080 | elapsed time per iteration (ms): 15433.9 | learning rate: 3.766E-05 | global batch size: 48 | lm loss: 6.384956E+00 | loss scale: 4096.0 | grad norm: 102953.672 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5297/ 159576 | consumed samples: 136128 | elapsed time per iteration (ms): 15429.7 | learning rate: 3.767E-05 | global batch size: 48 | lm loss: 6.436825E+00 | loss scale: 4096.0 | grad norm: 75031.086 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5298/ 159576 | consumed samples: 136176 | elapsed time per iteration (ms): 15818.4 | learning rate: 3.769E-05 | global batch size: 48 | lm loss: 6.482272E+00 | loss scale: 4096.0 | grad norm: 65276.986 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5299/ 159576 | consumed samples: 136224 | elapsed time per iteration (ms): 15441.5 | learning rate: 3.770E-05 | global batch size: 48 | lm loss: 6.589076E+00 | loss scale: 4096.0 | grad norm: 121561.959 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5300/ 159576 | consumed samples: 136272 | elapsed time per iteration (ms): 15422.2 | learning rate: 3.771E-05 | global batch size: 48 | lm loss: 6.405668E+00 | loss scale: 4096.0 | grad norm: 62093.972 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5301/ 159576 | consumed samples: 136320 | elapsed time per iteration (ms): 15355.0 | learning rate: 3.773E-05 | global batch size: 48 | lm loss: 6.390646E+00 | loss scale: 4096.0 | grad norm: 56038.998 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5302/ 159576 | consumed samples: 136368 | elapsed time per iteration (ms): 15565.3 | learning rate: 3.774E-05 | global batch size: 48 | lm loss: 6.410752E+00 | loss scale: 4096.0 | grad norm: 
64581.105 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5303/ 159576 | consumed samples: 136416 | elapsed time per iteration (ms): 15422.3 | learning rate: 3.775E-05 | global batch size: 48 | lm loss: 6.448494E+00 | loss scale: 4096.0 | grad norm: 77740.769 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5304/ 159576 | consumed samples: 136464 | elapsed time per iteration (ms): 15454.6 | learning rate: 3.777E-05 | global batch size: 48 | lm loss: 6.436998E+00 | loss scale: 4096.0 | grad norm: 86587.477 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5305/ 159576 | consumed samples: 136512 | elapsed time per iteration (ms): 15410.7 | learning rate: 3.778E-05 | global batch size: 48 | lm loss: 6.360906E+00 | loss scale: 4096.0 | grad norm: 102483.307 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5306/ 159576 | consumed samples: 136560 | elapsed time per iteration (ms): 15590.5 | learning rate: 3.779E-05 | global batch size: 48 | lm loss: 6.449046E+00 | loss scale: 4096.0 | grad norm: 63898.529 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5307/ 159576 | consumed samples: 136608 | elapsed time per iteration (ms): 15506.8 | learning rate: 3.781E-05 | global batch size: 48 | lm loss: 6.467348E+00 | loss scale: 4096.0 | grad norm: 66863.281 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5308/ 159576 | consumed samples: 136656 | elapsed time per iteration (ms): 15351.0 | learning rate: 3.782E-05 | global batch size: 48 | lm loss: 6.301440E+00 | loss scale: 4096.0 | grad norm: 66038.590 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5309/ 159576 | consumed samples: 136704 | elapsed time per iteration (ms): 15547.1 | learning rate: 3.783E-05 | global batch size: 48 | lm loss: 6.314401E+00 | loss scale: 4096.0 | grad norm: 100622.046 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5310/ 159576 | consumed samples: 136752 | elapsed time per iteration (ms): 15714.1 | learning rate: 3.785E-05 | global batch size: 48 | lm loss: 6.474138E+00 | loss scale: 4096.0 | grad norm: 100713.919 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5311/ 159576 | consumed samples: 136800 | elapsed time per iteration (ms): 15441.4 | learning rate: 3.786E-05 | global batch size: 48 | lm loss: 6.429978E+00 | loss scale: 4096.0 | grad norm: 73118.420 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5312/ 159576 | consumed samples: 136848 | elapsed time per iteration (ms): 15448.2 | learning rate: 3.787E-05 | global batch size: 48 | lm loss: 6.322928E+00 | loss scale: 4096.0 | grad norm: 79244.189 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5313/ 159576 | consumed samples: 136896 | elapsed time per iteration (ms): 15801.3 | learning rate: 3.789E-05 | global batch size: 48 | lm loss: 6.536728E+00 | loss scale: 4096.0 | grad norm: 80004.821 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5314/ 159576 | consumed samples: 136944 | 
elapsed time per iteration (ms): 15420.7 | learning rate: 3.790E-05 | global batch size: 48 | lm loss: 6.358313E+00 | loss scale: 4096.0 | grad norm: 73656.992 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5315/ 159576 | consumed samples: 136992 | elapsed time per iteration (ms): 15430.5 | learning rate: 3.791E-05 | global batch size: 48 | lm loss: 6.285139E+00 | loss scale: 4096.0 | grad norm: 72555.490 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5316/ 159576 | consumed samples: 137040 | elapsed time per iteration (ms): 15418.3 | learning rate: 3.793E-05 | global batch size: 48 | lm loss: 6.355993E+00 | loss scale: 4096.0 | grad norm: 89604.868 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5317/ 159576 | consumed samples: 137088 | elapsed time per iteration (ms): 15767.6 | learning rate: 3.794E-05 | global batch size: 48 | lm loss: 6.370296E+00 | loss scale: 4096.0 | grad norm: 68760.061 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5318/ 159576 | consumed samples: 137136 | elapsed time per iteration (ms): 15469.0 | learning rate: 3.795E-05 | global batch size: 48 | lm loss: 6.401207E+00 | loss scale: 4096.0 | grad norm: 64825.425 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5319/ 159576 | consumed samples: 137184 | elapsed time per iteration (ms): 15469.4 | learning rate: 3.797E-05 | global batch size: 48 | lm loss: 6.433188E+00 | loss scale: 4096.0 | grad norm: 75954.384 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5320/ 159576 | consumed samples: 137232 | elapsed time per iteration (ms): 15484.0 | learning rate: 3.798E-05 | global batch size: 48 | lm loss: 6.422481E+00 | loss scale: 4096.0 | grad norm: 85143.261 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5321/ 159576 | consumed samples: 137280 | elapsed time per iteration (ms): 15773.2 | learning rate: 3.799E-05 | global batch size: 48 | lm loss: 6.394318E+00 | loss scale: 4096.0 | grad norm: 81431.726 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5322/ 159576 | consumed samples: 137328 | elapsed time per iteration (ms): 15339.5 | learning rate: 3.801E-05 | global batch size: 48 | lm loss: 6.498918E+00 | loss scale: 4096.0 | grad norm: 76418.870 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5323/ 159576 | consumed samples: 137376 | elapsed time per iteration (ms): 15420.7 | learning rate: 3.802E-05 | global batch size: 48 | lm loss: 6.518599E+00 | loss scale: 4096.0 | grad norm: 71705.255 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5324/ 159576 | consumed samples: 137424 | elapsed time per iteration (ms): 15420.3 | learning rate: 3.803E-05 | global batch size: 48 | lm loss: 6.429631E+00 | loss scale: 4096.0 | grad norm: 57358.188 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5325/ 159576 | consumed samples: 137472 | elapsed time per iteration (ms): 15903.1 | learning rate: 3.805E-05 | global batch size: 48 | lm loss: 6.407781E+00 | loss scale: 4096.0 | grad norm: 91506.505 | num 
zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5326/ 159576 | consumed samples: 137520 | elapsed time per iteration (ms): 15425.4 | learning rate: 3.806E-05 | global batch size: 48 | lm loss: 6.399868E+00 | loss scale: 4096.0 | grad norm: 68843.352 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5327/ 159576 | consumed samples: 137568 | elapsed time per iteration (ms): 15444.3 | learning rate: 3.807E-05 | global batch size: 48 | lm loss: 6.412372E+00 | loss scale: 4096.0 | grad norm: 67149.711 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5328/ 159576 | consumed samples: 137616 | elapsed time per iteration (ms): 15406.6 | learning rate: 3.809E-05 | global batch size: 48 | lm loss: 6.430699E+00 | loss scale: 4096.0 | grad norm: 102742.719 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5329/ 159576 | consumed samples: 137664 | elapsed time per iteration (ms): 15722.7 | learning rate: 3.810E-05 | global batch size: 48 | lm loss: 6.415520E+00 | loss scale: 4096.0 | grad norm: 73301.472 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5330/ 159576 | consumed samples: 137712 | elapsed time per iteration (ms): 15405.0 | learning rate: 3.811E-05 | global batch size: 48 | lm loss: 6.359590E+00 | loss scale: 4096.0 | grad norm: 70222.523 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5331/ 159576 | consumed samples: 137760 | elapsed time per iteration (ms): 15374.6 | learning rate: 3.813E-05 | global batch size: 48 | lm loss: 6.443409E+00 | loss scale: 4096.0 | grad norm: 79619.657 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5332/ 159576 | consumed samples: 137808 | elapsed time per iteration (ms): 15404.3 | learning rate: 3.814E-05 | global batch size: 48 | lm loss: 6.412749E+00 | loss scale: 4096.0 | grad norm: 110889.514 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5333/ 159576 | consumed samples: 137856 | elapsed time per iteration (ms): 15590.4 | learning rate: 3.815E-05 | global batch size: 48 | lm loss: 6.492513E+00 | loss scale: 4096.0 | grad norm: 80255.448 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5334/ 159576 | consumed samples: 137904 | elapsed time per iteration (ms): 15436.5 | learning rate: 3.817E-05 | global batch size: 48 | lm loss: 6.400149E+00 | loss scale: 4096.0 | grad norm: 69554.344 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5335/ 159576 | consumed samples: 137952 | elapsed time per iteration (ms): 15422.0 | learning rate: 3.818E-05 | global batch size: 48 | lm loss: 6.473186E+00 | loss scale: 4096.0 | grad norm: 96185.543 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5336/ 159576 | consumed samples: 138000 | elapsed time per iteration (ms): 15442.7 | learning rate: 3.819E-05 | global batch size: 48 | lm loss: 6.552884E+00 | loss scale: 4096.0 | grad norm: 73254.921 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5337/ 159576 | consumed samples: 138048 | elapsed time per 
iteration (ms): 15634.6 | learning rate: 3.821E-05 | global batch size: 48 | lm loss: 6.365612E+00 | loss scale: 4096.0 | grad norm: 57539.381 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5338/ 159576 | consumed samples: 138096 | elapsed time per iteration (ms): 15386.8 | learning rate: 3.822E-05 | global batch size: 48 | lm loss: 6.445109E+00 | loss scale: 4096.0 | grad norm: 67382.289 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5339/ 159576 | consumed samples: 138144 | elapsed time per iteration (ms): 15470.1 | learning rate: 3.823E-05 | global batch size: 48 | lm loss: 6.353713E+00 | loss scale: 4096.0 | grad norm: 110272.660 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5340/ 159576 | consumed samples: 138192 | elapsed time per iteration (ms): 15791.0 | learning rate: 3.825E-05 | global batch size: 48 | lm loss: 6.413539E+00 | loss scale: 4096.0 | grad norm: 72349.998 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5341/ 159576 | consumed samples: 138240 | elapsed time per iteration (ms): 15411.4 | learning rate: 3.826E-05 | global batch size: 48 | lm loss: 6.347322E+00 | loss scale: 4096.0 | grad norm: 61859.125 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5342/ 159576 | consumed samples: 138288 | elapsed time per iteration (ms): 15471.9 | learning rate: 3.827E-05 | global batch size: 48 | lm loss: 6.298682E+00 | loss scale: 4096.0 | grad norm: 78125.812 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5343/ 159576 | consumed samples: 138336 | elapsed time per iteration (ms): 15450.5 | learning rate: 3.829E-05 | global batch size: 48 | lm loss: 6.346509E+00 | loss scale: 4096.0 | grad norm: 76921.340 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5344/ 159576 | consumed samples: 138384 | elapsed time per iteration (ms): 15797.4 | learning rate: 3.830E-05 | global batch size: 48 | lm loss: 6.464560E+00 | loss scale: 4096.0 | grad norm: 73833.261 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5345/ 159576 | consumed samples: 138432 | elapsed time per iteration (ms): 15447.3 | learning rate: 3.831E-05 | global batch size: 48 | lm loss: 6.491942E+00 | loss scale: 4096.0 | grad norm: 58609.094 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5346/ 159576 | consumed samples: 138480 | elapsed time per iteration (ms): 15470.6 | learning rate: 3.833E-05 | global batch size: 48 | lm loss: 6.408776E+00 | loss scale: 4096.0 | grad norm: 61084.726 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5347/ 159576 | consumed samples: 138528 | elapsed time per iteration (ms): 15595.7 | learning rate: 3.834E-05 | global batch size: 48 | lm loss: 6.317072E+00 | loss scale: 4096.0 | grad norm: 79107.564 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5348/ 159576 | consumed samples: 138576 | elapsed time per iteration (ms): 15857.5 | learning rate: 3.835E-05 | global batch size: 48 | lm loss: 6.342214E+00 | loss scale: 4096.0 | grad norm: 82396.508 | num zeros: 0.0 | 
number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5349/ 159576 | consumed samples: 138624 | elapsed time per iteration (ms): 15501.3 | learning rate: 3.837E-05 | global batch size: 48 | lm loss: 6.416060E+00 | loss scale: 4096.0 | grad norm: 58909.391 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5350/ 159576 | consumed samples: 138672 | elapsed time per iteration (ms): 15334.9 | learning rate: 3.838E-05 | global batch size: 48 | lm loss: 6.348287E+00 | loss scale: 4096.0 | grad norm: 54069.980 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5351/ 159576 | consumed samples: 138720 | elapsed time per iteration (ms): 15454.2 | learning rate: 3.839E-05 | global batch size: 48 | lm loss: 6.456007E+00 | loss scale: 4096.0 | grad norm: 61307.306 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5352/ 159576 | consumed samples: 138768 | elapsed time per iteration (ms): 15972.1 | learning rate: 3.841E-05 | global batch size: 48 | lm loss: 6.276731E+00 | loss scale: 4096.0 | grad norm: 62789.049 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5353/ 159576 | consumed samples: 138816 | elapsed time per iteration (ms): 15447.0 | learning rate: 3.842E-05 | global batch size: 48 | lm loss: 6.443192E+00 | loss scale: 4096.0 | grad norm: 75454.112 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5354/ 159576 | consumed samples: 138864 | elapsed time per iteration (ms): 15426.1 | learning rate: 3.843E-05 | global batch size: 48 | lm loss: 6.301665E+00 | loss scale: 4096.0 | grad norm: 66381.021 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5355/ 159576 | consumed samples: 138912 | elapsed time per iteration (ms): 15465.4 | learning rate: 3.845E-05 | global batch size: 48 | lm loss: 6.453572E+00 | loss scale: 4096.0 | grad norm: 63236.178 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5356/ 159576 | consumed samples: 138960 | elapsed time per iteration (ms): 15595.7 | learning rate: 3.846E-05 | global batch size: 48 | lm loss: 6.391494E+00 | loss scale: 4096.0 | grad norm: 78457.049 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5357/ 159576 | consumed samples: 139008 | elapsed time per iteration (ms): 15508.4 | learning rate: 3.847E-05 | global batch size: 48 | lm loss: 6.379974E+00 | loss scale: 4096.0 | grad norm: 85282.485 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5358/ 159576 | consumed samples: 139056 | elapsed time per iteration (ms): 15495.7 | learning rate: 3.849E-05 | global batch size: 48 | lm loss: 6.517261E+00 | loss scale: 4096.0 | grad norm: 75329.391 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5359/ 159576 | consumed samples: 139104 | elapsed time per iteration (ms): 15455.1 | learning rate: 3.850E-05 | global batch size: 48 | lm loss: 6.311386E+00 | loss scale: 4096.0 | grad norm: 74599.792 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5360/ 159576 | consumed samples: 139152 | elapsed time per iteration (ms): 
15693.4 | learning rate: 3.851E-05 | global batch size: 48 | lm loss: 6.481428E+00 | loss scale: 4096.0 | grad norm: 77215.648 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5361/ 159576 | consumed samples: 139200 | elapsed time per iteration (ms): 15475.6 | learning rate: 3.853E-05 | global batch size: 48 | lm loss: 6.331719E+00 | loss scale: 4096.0 | grad norm: 60279.803 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5362/ 159576 | consumed samples: 139248 | elapsed time per iteration (ms): 15551.6 | learning rate: 3.854E-05 | global batch size: 48 | lm loss: 6.506707E+00 | loss scale: 4096.0 | grad norm: 57442.387 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5363/ 159576 | consumed samples: 139296 | elapsed time per iteration (ms): 15525.0 | learning rate: 3.855E-05 | global batch size: 48 | lm loss: 6.283090E+00 | loss scale: 4096.0 | grad norm: 69167.961 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5364/ 159576 | consumed samples: 139344 | elapsed time per iteration (ms): 15703.9 | learning rate: 3.857E-05 | global batch size: 48 | lm loss: 6.344968E+00 | loss scale: 4096.0 | grad norm: 66351.451 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5365/ 159576 | consumed samples: 139392 | elapsed time per iteration (ms): 15511.9 | learning rate: 3.858E-05 | global batch size: 48 | lm loss: 6.402239E+00 | loss scale: 4096.0 | grad norm: 69893.747 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5366/ 159576 | consumed samples: 139440 | elapsed time per iteration (ms): 15507.6 | learning rate: 3.859E-05 | global batch size: 48 | lm loss: 6.510591E+00 | loss scale: 4096.0 | grad norm: 73294.922 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5367/ 159576 | consumed samples: 139488 | elapsed time per iteration (ms): 15841.0 | learning rate: 3.861E-05 | global batch size: 48 | lm loss: 6.292207E+00 | loss scale: 4096.0 | grad norm: 69220.189 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5368/ 159576 | consumed samples: 139536 | elapsed time per iteration (ms): 15748.2 | learning rate: 3.862E-05 | global batch size: 48 | lm loss: 6.492587E+00 | loss scale: 4096.0 | grad norm: 78294.485 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5369/ 159576 | consumed samples: 139584 | elapsed time per iteration (ms): 15492.3 | learning rate: 3.863E-05 | global batch size: 48 | lm loss: 6.493845E+00 | loss scale: 4096.0 | grad norm: 94517.294 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5370/ 159576 | consumed samples: 139632 | elapsed time per iteration (ms): 15493.8 | learning rate: 3.864E-05 | global batch size: 48 | lm loss: 6.430061E+00 | loss scale: 4096.0 | grad norm: 77523.471 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5371/ 159576 | consumed samples: 139680 | elapsed time per iteration (ms): 15870.2 | learning rate: 3.866E-05 | global batch size: 48 | lm loss: 6.411311E+00 | loss scale: 4096.0 | grad norm: 69582.630 | num zeros: 0.0 | number of skipped 
iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5372/ 159576 | consumed samples: 139728 | elapsed time per iteration (ms): 15517.9 | learning rate: 3.867E-05 | global batch size: 48 | lm loss: 6.515477E+00 | loss scale: 4096.0 | grad norm: 75626.793 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5373/ 159576 | consumed samples: 139776 | elapsed time per iteration (ms): 15491.8 | learning rate: 3.868E-05 | global batch size: 48 | lm loss: 6.453342E+00 | loss scale: 4096.0 | grad norm: 69940.821 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5374/ 159576 | consumed samples: 139824 | elapsed time per iteration (ms): 15511.6 | learning rate: 3.870E-05 | global batch size: 48 | lm loss: 6.378087E+00 | loss scale: 4096.0 | grad norm: 70420.660 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5375/ 159576 | consumed samples: 139872 | elapsed time per iteration (ms): 15836.7 | learning rate: 3.871E-05 | global batch size: 48 | lm loss: 6.371119E+00 | loss scale: 4096.0 | grad norm: 56046.647 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5376/ 159576 | consumed samples: 139920 | elapsed time per iteration (ms): 15468.7 | learning rate: 3.872E-05 | global batch size: 48 | lm loss: 6.480386E+00 | loss scale: 4096.0 | grad norm: 67254.408 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5377/ 159576 | consumed samples: 139968 | elapsed time per iteration (ms): 15505.8 | learning rate: 3.874E-05 | global batch size: 48 | lm loss: 6.445705E+00 | loss scale: 4096.0 | grad norm: 58120.342 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5378/ 159576 | consumed samples: 140016 | elapsed time per iteration (ms): 15512.2 | learning rate: 3.875E-05 | global batch size: 48 | lm loss: 6.383876E+00 | loss scale: 4096.0 | grad norm: 63811.158 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5379/ 159576 | consumed samples: 140064 | elapsed time per iteration (ms): 15885.3 | learning rate: 3.876E-05 | global batch size: 48 | lm loss: 6.430426E+00 | loss scale: 4096.0 | grad norm: 71627.105 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5380/ 159576 | consumed samples: 140112 | elapsed time per iteration (ms): 15514.4 | learning rate: 3.878E-05 | global batch size: 48 | lm loss: 6.352599E+00 | loss scale: 4096.0 | grad norm: 55768.573 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5381/ 159576 | consumed samples: 140160 | elapsed time per iteration (ms): 15536.5 | learning rate: 3.879E-05 | global batch size: 48 | lm loss: 6.462265E+00 | loss scale: 4096.0 | grad norm: 76307.339 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5382/ 159576 | consumed samples: 140208 | elapsed time per iteration (ms): 15499.8 | learning rate: 3.880E-05 | global batch size: 48 | lm loss: 6.439154E+00 | loss scale: 4096.0 | grad norm: 97619.861 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5383/ 159576 | consumed samples: 140256 | elapsed time per iteration (ms): 15693.9 | learning 
rate: 3.882E-05 | global batch size: 48 | lm loss: 6.327425E+00 | loss scale: 4096.0 | grad norm: 69803.211 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5384/ 159576 | consumed samples: 140304 | elapsed time per iteration (ms): 15550.5 | learning rate: 3.883E-05 | global batch size: 48 | lm loss: 6.391693E+00 | loss scale: 4096.0 | grad norm: 66211.348 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5385/ 159576 | consumed samples: 140352 | elapsed time per iteration (ms): 15520.0 | learning rate: 3.884E-05 | global batch size: 48 | lm loss: 6.323473E+00 | loss scale: 4096.0 | grad norm: 68034.810 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5386/ 159576 | consumed samples: 140400 | elapsed time per iteration (ms): 15545.0 | learning rate: 3.886E-05 | global batch size: 48 | lm loss: 6.299393E+00 | loss scale: 4096.0 | grad norm: 85492.599 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5387/ 159576 | consumed samples: 140448 | elapsed time per iteration (ms): 15684.9 | learning rate: 3.887E-05 | global batch size: 48 | lm loss: 6.374225E+00 | loss scale: 4096.0 | grad norm: 72949.757 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5388/ 159576 | consumed samples: 140496 | elapsed time per iteration (ms): 15553.2 | learning rate: 3.888E-05 | global batch size: 48 | lm loss: 6.446224E+00 | loss scale: 4096.0 | grad norm: 83315.401 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5389/ 159576 | consumed samples: 140544 | elapsed time per iteration (ms): 15520.1 | learning rate: 3.890E-05 | global batch size: 48 | lm loss: 6.336344E+00 | loss scale: 4096.0 | grad norm: 60566.619 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5390/ 159576 | consumed samples: 140592 | elapsed time per iteration (ms): 15438.2 | learning rate: 3.891E-05 | global batch size: 48 | lm loss: 6.437949E+00 | loss scale: 4096.0 | grad norm: 93800.672 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5391/ 159576 | consumed samples: 140640 | elapsed time per iteration (ms): 15842.4 | learning rate: 3.892E-05 | global batch size: 48 | lm loss: 6.445059E+00 | loss scale: 4096.0 | grad norm: 67207.362 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5392/ 159576 | consumed samples: 140688 | elapsed time per iteration (ms): 15543.4 | learning rate: 3.894E-05 | global batch size: 48 | lm loss: 6.340952E+00 | loss scale: 4096.0 | grad norm: 92289.634 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5393/ 159576 | consumed samples: 140736 | elapsed time per iteration (ms): 15518.9 | learning rate: 3.895E-05 | global batch size: 48 | lm loss: 6.416577E+00 | loss scale: 4096.0 | grad norm: 84099.384 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5394/ 159576 | consumed samples: 140784 | elapsed time per iteration (ms): 15997.3 | learning rate: 3.896E-05 | global batch size: 48 | lm loss: 6.439622E+00 | loss scale: 4096.0 | grad norm: 54809.573 | num zeros: 0.0 | number of skipped iterations: 0 | 
number of nan iterations: 0 | -time (ms) - iteration 5395/ 159576 | consumed samples: 140832 | elapsed time per iteration (ms): 15450.3 | learning rate: 3.898E-05 | global batch size: 48 | lm loss: 6.441430E+00 | loss scale: 4096.0 | grad norm: 63144.662 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5396/ 159576 | consumed samples: 140880 | elapsed time per iteration (ms): 15568.2 | learning rate: 3.899E-05 | global batch size: 48 | lm loss: 6.424047E+00 | loss scale: 4096.0 | grad norm: 106261.057 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5397/ 159576 | consumed samples: 140928 | elapsed time per iteration (ms): 15464.4 | learning rate: 3.900E-05 | global batch size: 48 | lm loss: 6.325677E+00 | loss scale: 4096.0 | grad norm: 64383.277 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5398/ 159576 | consumed samples: 140976 | elapsed time per iteration (ms): 15883.9 | learning rate: 3.902E-05 | global batch size: 48 | lm loss: 6.582463E+00 | loss scale: 4096.0 | grad norm: 66662.490 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5399/ 159576 | consumed samples: 141024 | elapsed time per iteration (ms): 15497.5 | learning rate: 3.903E-05 | global batch size: 48 | lm loss: 6.498641E+00 | loss scale: 4096.0 | grad norm: 59391.511 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5400/ 159576 | consumed samples: 141072 | elapsed time per iteration (ms): 15569.9 | learning rate: 3.904E-05 | global batch size: 48 | lm loss: 6.283938E+00 | loss scale: 4096.0 | grad norm: 64487.813 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5401/ 159576 | consumed samples: 141120 | elapsed time per iteration (ms): 15526.8 | learning rate: 3.906E-05 | global batch size: 48 | lm loss: 6.336715E+00 | loss scale: 4096.0 | grad norm: 57781.336 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5402/ 159576 | consumed samples: 141168 | elapsed time per iteration (ms): 15981.6 | learning rate: 3.907E-05 | global batch size: 48 | lm loss: 6.293415E+00 | loss scale: 4096.0 | grad norm: 92738.567 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5403/ 159576 | consumed samples: 141216 | elapsed time per iteration (ms): 15632.0 | learning rate: 3.908E-05 | global batch size: 48 | lm loss: 6.294649E+00 | loss scale: 4096.0 | grad norm: 62910.047 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5404/ 159576 | consumed samples: 141264 | elapsed time per iteration (ms): 15497.6 | learning rate: 3.910E-05 | global batch size: 48 | lm loss: 6.331801E+00 | loss scale: 4096.0 | grad norm: 64648.240 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5405/ 159576 | consumed samples: 141312 | elapsed time per iteration (ms): 15498.1 | learning rate: 3.911E-05 | global batch size: 48 | lm loss: 6.406822E+00 | loss scale: 4096.0 | grad norm: 71416.233 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5406/ 159576 | consumed samples: 141360 | elapsed time per iteration (ms): 15867.4 | learning rate: 3.912E-05 | 
global batch size: 48 | lm loss: 6.404875E+00 | loss scale: 4096.0 | grad norm: 56955.709 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5407/ 159576 | consumed samples: 141408 | elapsed time per iteration (ms): 15506.2 | learning rate: 3.914E-05 | global batch size: 48 | lm loss: 6.428100E+00 | loss scale: 4096.0 | grad norm: 65410.012 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5408/ 159576 | consumed samples: 141456 | elapsed time per iteration (ms): 15573.9 | learning rate: 3.915E-05 | global batch size: 48 | lm loss: 6.352518E+00 | loss scale: 4096.0 | grad norm: 57463.162 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5409/ 159576 | consumed samples: 141504 | elapsed time per iteration (ms): 15570.8 | learning rate: 3.916E-05 | global batch size: 48 | lm loss: 6.276915E+00 | loss scale: 4096.0 | grad norm: 56808.465 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5410/ 159576 | consumed samples: 141552 | elapsed time per iteration (ms): 15647.9 | learning rate: 3.918E-05 | global batch size: 48 | lm loss: 6.388402E+00 | loss scale: 4096.0 | grad norm: 55831.269 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5411/ 159576 | consumed samples: 141600 | elapsed time per iteration (ms): 15527.8 | learning rate: 3.919E-05 | global batch size: 48 | lm loss: 6.359324E+00 | loss scale: 4096.0 | grad norm: 58176.863 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5412/ 159576 | consumed samples: 141648 | elapsed time per iteration (ms): 15485.9 | learning rate: 3.920E-05 | global batch size: 48 | lm loss: 6.410316E+00 | loss scale: 4096.0 | grad norm: 58797.382 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5413/ 159576 | consumed samples: 141696 | elapsed time per iteration (ms): 15570.6 | learning rate: 3.922E-05 | global batch size: 48 | lm loss: 6.487602E+00 | loss scale: 4096.0 | grad norm: 54779.384 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5414/ 159576 | consumed samples: 141744 | elapsed time per iteration (ms): 15692.4 | learning rate: 3.923E-05 | global batch size: 48 | lm loss: 6.538764E+00 | loss scale: 4096.0 | grad norm: 56952.810 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5415/ 159576 | consumed samples: 141808 | elapsed time per iteration (ms): 16423.4 | learning rate: 3.925E-05 | global batch size: 64 | lm loss: 6.468464E+00 | loss scale: 4096.0 | grad norm: 47962.953 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5416/ 159576 | consumed samples: 141872 | elapsed time per iteration (ms): 16486.4 | learning rate: 3.927E-05 | global batch size: 64 | lm loss: 6.358836E+00 | loss scale: 4096.0 | grad norm: 79746.041 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5417/ 159576 | consumed samples: 141936 | elapsed time per iteration (ms): 16837.9 | learning rate: 3.928E-05 | global batch size: 64 | lm loss: 6.458796E+00 | loss scale: 4096.0 | grad norm: 72485.233 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan
iterations: 0 | -time (ms) - iteration 5418/ 159576 | consumed samples: 142000 | elapsed time per iteration (ms): 16282.1 | learning rate: 3.930E-05 | global batch size: 64 | lm loss: 6.325031E+00 | loss scale: 4096.0 | grad norm: 50657.294 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5419/ 159576 | consumed samples: 142064 | elapsed time per iteration (ms): 16473.5 | learning rate: 3.932E-05 | global batch size: 64 | lm loss: 6.393603E+00 | loss scale: 4096.0 | grad norm: 53317.124 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5420/ 159576 | consumed samples: 142128 | elapsed time per iteration (ms): 16358.3 | learning rate: 3.934E-05 | global batch size: 64 | lm loss: 6.505975E+00 | loss scale: 4096.0 | grad norm: 76759.970 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5421/ 159576 | consumed samples: 142192 | elapsed time per iteration (ms): 16646.9 | learning rate: 3.936E-05 | global batch size: 64 | lm loss: 6.377459E+00 | loss scale: 4096.0 | grad norm: 61658.865 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5422/ 159576 | consumed samples: 142256 | elapsed time per iteration (ms): 16480.4 | learning rate: 3.937E-05 | global batch size: 64 | lm loss: 6.350579E+00 | loss scale: 4096.0 | grad norm: 61672.596 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5423/ 159576 | consumed samples: 142320 | elapsed time per iteration (ms): 16500.8 | learning rate: 3.939E-05 | global batch size: 64 | lm loss: 6.359305E+00 | loss scale: 4096.0 | grad norm: 71934.386 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5424/ 159576 | consumed samples: 142384 | elapsed time per iteration (ms): 16400.7 | learning rate: 3.941E-05 | global batch size: 64 | lm loss: 6.515474E+00 | loss scale: 4096.0 | grad norm: 62262.598 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5425/ 159576 | consumed samples: 142448 | elapsed time per iteration (ms): 16686.7 | learning rate: 3.943E-05 | global batch size: 64 | lm loss: 6.377324E+00 | loss scale: 4096.0 | grad norm: 66128.264 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5426/ 159576 | consumed samples: 142512 | elapsed time per iteration (ms): 16346.9 | learning rate: 3.944E-05 | global batch size: 64 | lm loss: 6.394655E+00 | loss scale: 4096.0 | grad norm: 64276.983 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5427/ 159576 | consumed samples: 142576 | elapsed time per iteration (ms): 16454.0 | learning rate: 3.946E-05 | global batch size: 64 | lm loss: 6.417256E+00 | loss scale: 4096.0 | grad norm: 55916.762 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5428/ 159576 | consumed samples: 142640 | elapsed time per iteration (ms): 16713.8 | learning rate: 3.948E-05 | global batch size: 64 | lm loss: 6.314127E+00 | loss scale: 4096.0 | grad norm: 65443.157 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5429/ 159576 | consumed samples: 142704 | elapsed time per iteration (ms): 16492.7 | learning rate: 3.950E-05 | global batch 
size: 64 | lm loss: 6.349669E+00 | loss scale: 4096.0 | grad norm: 64819.083 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5430/ 159576 | consumed samples: 142768 | elapsed time per iteration (ms): 16430.1 | learning rate: 3.951E-05 | global batch size: 64 | lm loss: 6.406096E+00 | loss scale: 4096.0 | grad norm: 72027.252 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5431/ 159576 | consumed samples: 142832 | elapsed time per iteration (ms): 16452.9 | learning rate: 3.953E-05 | global batch size: 64 | lm loss: 6.422045E+00 | loss scale: 4096.0 | grad norm: 59470.191 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5432/ 159576 | consumed samples: 142896 | elapsed time per iteration (ms): 16574.0 | learning rate: 3.955E-05 | global batch size: 64 | lm loss: 6.384964E+00 | loss scale: 4096.0 | grad norm: 59229.555 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5433/ 159576 | consumed samples: 142960 | elapsed time per iteration (ms): 16448.4 | learning rate: 3.957E-05 | global batch size: 64 | lm loss: 6.388242E+00 | loss scale: 4096.0 | grad norm: 51139.017 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5434/ 159576 | consumed samples: 143024 | elapsed time per iteration (ms): 16378.2 | learning rate: 3.959E-05 | global batch size: 64 | lm loss: 6.422913E+00 | loss scale: 4096.0 | grad norm: 55548.958 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5435/ 159576 | consumed samples: 143088 | elapsed time per iteration (ms): 16838.8 | learning rate: 3.960E-05 | global batch size: 64 | lm loss: 6.399693E+00 | loss scale: 4096.0 | grad norm: 87728.143 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5436/ 159576 | consumed samples: 143152 | elapsed time per iteration (ms): 16458.9 | learning rate: 3.962E-05 | global batch size: 64 | lm loss: 6.291359E+00 | loss scale: 4096.0 | grad norm: 65955.697 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5437/ 159576 | consumed samples: 143216 | elapsed time per iteration (ms): 16425.2 | learning rate: 3.964E-05 | global batch size: 64 | lm loss: 6.367932E+00 | loss scale: 4096.0 | grad norm: 63150.328 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5438/ 159576 | consumed samples: 143280 | elapsed time per iteration (ms): 16418.8 | learning rate: 3.966E-05 | global batch size: 64 | lm loss: 6.365756E+00 | loss scale: 4096.0 | grad norm: 57427.195 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5439/ 159576 | consumed samples: 143344 | elapsed time per iteration (ms): 16802.3 | learning rate: 3.967E-05 | global batch size: 64 | lm loss: 6.415596E+00 | loss scale: 4096.0 | grad norm: 61605.287 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5440/ 159576 | consumed samples: 143408 | elapsed time per iteration (ms): 16516.9 | learning rate: 3.969E-05 | global batch size: 64 | lm loss: 6.414165E+00 | loss scale: 4096.0 | grad norm: 64434.632 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time 
(ms) - iteration 5441/ 159576 | consumed samples: 143472 | elapsed time per iteration (ms): 16398.0 | learning rate: 3.971E-05 | global batch size: 64 | lm loss: 6.425170E+00 | loss scale: 4096.0 | grad norm: 63830.236 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5442/ 159576 | consumed samples: 143536 | elapsed time per iteration (ms): 16330.0 | learning rate: 3.973E-05 | global batch size: 64 | lm loss: 6.420317E+00 | loss scale: 4096.0 | grad norm: 80818.483 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5443/ 159576 | consumed samples: 143600 | elapsed time per iteration (ms): 16646.2 | learning rate: 3.975E-05 | global batch size: 64 | lm loss: 6.404300E+00 | loss scale: 4096.0 | grad norm: 66058.957 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5444/ 159576 | consumed samples: 143664 | elapsed time per iteration (ms): 16389.9 | learning rate: 3.976E-05 | global batch size: 64 | lm loss: 6.307170E+00 | loss scale: 4096.0 | grad norm: 64553.082 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5445/ 159576 | consumed samples: 143728 | elapsed time per iteration (ms): 16425.8 | learning rate: 3.978E-05 | global batch size: 64 | lm loss: 6.474117E+00 | loss scale: 4096.0 | grad norm: 54414.389 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5446/ 159576 | consumed samples: 143792 | elapsed time per iteration (ms): 16855.6 | learning rate: 3.980E-05 | global batch size: 64 | lm loss: 6.329272E+00 | loss scale: 4096.0 | grad norm: 67896.275 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5447/ 159576 | consumed samples: 143856 | elapsed time per iteration (ms): 16363.1 | learning rate: 3.982E-05 | global batch size: 64 | lm loss: 6.485427E+00 | loss scale: 4096.0 | grad norm: 55200.098 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5448/ 159576 | consumed samples: 143920 | elapsed time per iteration (ms): 16446.4 | learning rate: 3.983E-05 | global batch size: 64 | lm loss: 6.474103E+00 | loss scale: 4096.0 | grad norm: 58759.422 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5449/ 159576 | consumed samples: 143984 | elapsed time per iteration (ms): 16365.5 | learning rate: 3.985E-05 | global batch size: 64 | lm loss: 6.386650E+00 | loss scale: 4096.0 | grad norm: 69075.558 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5450/ 159576 | consumed samples: 144048 | elapsed time per iteration (ms): 16855.4 | learning rate: 3.987E-05 | global batch size: 64 | lm loss: 6.407839E+00 | loss scale: 4096.0 | grad norm: 76751.714 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5451/ 159576 | consumed samples: 144112 | elapsed time per iteration (ms): 16481.2 | learning rate: 3.989E-05 | global batch size: 64 | lm loss: 6.437217E+00 | loss scale: 4096.0 | grad norm: 60762.834 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5452/ 159576 | consumed samples: 144176 | elapsed time per iteration (ms): 16387.3 | learning rate: 3.991E-05 | global batch size: 64 | lm loss: 
6.391966E+00 | loss scale: 4096.0 | grad norm: 57835.999 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5453/ 159576 | consumed samples: 144240 | elapsed time per iteration (ms): 16456.9 | learning rate: 3.992E-05 | global batch size: 64 | lm loss: 6.407461E+00 | loss scale: 4096.0 | grad norm: 56276.948 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5454/ 159576 | consumed samples: 144304 | elapsed time per iteration (ms): 16533.3 | learning rate: 3.994E-05 | global batch size: 64 | lm loss: 6.319425E+00 | loss scale: 4096.0 | grad norm: 66856.562 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5455/ 159576 | consumed samples: 144368 | elapsed time per iteration (ms): 16417.1 | learning rate: 3.996E-05 | global batch size: 64 | lm loss: 6.377168E+00 | loss scale: 4096.0 | grad norm: 53863.935 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5456/ 159576 | consumed samples: 144432 | elapsed time per iteration (ms): 16422.1 | learning rate: 3.998E-05 | global batch size: 64 | lm loss: 6.368913E+00 | loss scale: 4096.0 | grad norm: 63261.354 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5457/ 159576 | consumed samples: 144496 | elapsed time per iteration (ms): 16738.2 | learning rate: 3.999E-05 | global batch size: 64 | lm loss: 6.264383E+00 | loss scale: 4096.0 | grad norm: 64656.043 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5458/ 159576 | consumed samples: 144560 | elapsed time per iteration (ms): 16315.9 | learning rate: 4.001E-05 | global batch size: 64 | lm loss: 6.410008E+00 | loss scale: 4096.0 | grad norm: 82472.599 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5459/ 159576 | consumed samples: 144624 | elapsed time per iteration (ms): 16385.7 | learning rate: 4.003E-05 | global batch size: 64 | lm loss: 6.419100E+00 | loss scale: 4096.0 | grad norm: 81581.674 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5460/ 159576 | consumed samples: 144688 | elapsed time per iteration (ms): 16422.6 | learning rate: 4.005E-05 | global batch size: 64 | lm loss: 6.374327E+00 | loss scale: 4096.0 | grad norm: 77883.993 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5461/ 159576 | consumed samples: 144752 | elapsed time per iteration (ms): 16514.0 | learning rate: 4.007E-05 | global batch size: 64 | lm loss: 6.323710E+00 | loss scale: 4096.0 | grad norm: 59535.385 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5462/ 159576 | consumed samples: 144816 | elapsed time per iteration (ms): 16520.4 | learning rate: 4.008E-05 | global batch size: 64 | lm loss: 6.325150E+00 | loss scale: 4096.0 | grad norm: 54807.099 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5463/ 159576 | consumed samples: 144880 | elapsed time per iteration (ms): 16362.9 | learning rate: 4.010E-05 | global batch size: 64 | lm loss: 6.461391E+00 | loss scale: 4096.0 | grad norm: 74839.084 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 
5464/ 159576 | consumed samples: 144944 | elapsed time per iteration (ms): 16408.3 | learning rate: 4.012E-05 | global batch size: 64 | lm loss: 6.392217E+00 | loss scale: 4096.0 | grad norm: 61727.667 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5465/ 159576 | consumed samples: 145008 | elapsed time per iteration (ms): 16556.8 | learning rate: 4.014E-05 | global batch size: 64 | lm loss: 6.349445E+00 | loss scale: 4096.0 | grad norm: 90938.249 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5466/ 159576 | consumed samples: 145072 | elapsed time per iteration (ms): 16389.1 | learning rate: 4.015E-05 | global batch size: 64 | lm loss: 6.314983E+00 | loss scale: 4096.0 | grad norm: 62408.172 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5467/ 159576 | consumed samples: 145136 | elapsed time per iteration (ms): 16364.1 | learning rate: 4.017E-05 | global batch size: 64 | lm loss: 6.412921E+00 | loss scale: 4096.0 | grad norm: 82535.193 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5468/ 159576 | consumed samples: 145200 | elapsed time per iteration (ms): 16712.9 | learning rate: 4.019E-05 | global batch size: 64 | lm loss: 6.508467E+00 | loss scale: 4096.0 | grad norm: 53388.956 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5469/ 159576 | consumed samples: 145264 | elapsed time per iteration (ms): 16357.7 | learning rate: 4.021E-05 | global batch size: 64 | lm loss: 6.367021E+00 | loss scale: 4096.0 | grad norm: 88053.691 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5470/ 159576 | consumed samples: 145328 | elapsed time per iteration (ms): 16424.7 | learning rate: 4.022E-05 | global batch size: 64 | lm loss: 6.396588E+00 | loss scale: 4096.0 | grad norm: 83281.076 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5471/ 159576 | consumed samples: 145392 | elapsed time per iteration (ms): 16363.6 | learning rate: 4.024E-05 | global batch size: 64 | lm loss: 6.387273E+00 | loss scale: 4096.0 | grad norm: 56875.433 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5472/ 159576 | consumed samples: 145456 | elapsed time per iteration (ms): 16523.2 | learning rate: 4.026E-05 | global batch size: 64 | lm loss: 6.456463E+00 | loss scale: 4096.0 | grad norm: 60270.862 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5473/ 159576 | consumed samples: 145520 | elapsed time per iteration (ms): 16398.7 | learning rate: 4.028E-05 | global batch size: 64 | lm loss: 6.460003E+00 | loss scale: 4096.0 | grad norm: 61151.257 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5474/ 159576 | consumed samples: 145584 | elapsed time per iteration (ms): 16345.5 | learning rate: 4.030E-05 | global batch size: 64 | lm loss: 6.443559E+00 | loss scale: 4096.0 | grad norm: 83130.420 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5475/ 159576 | consumed samples: 145648 | elapsed time per iteration (ms): 16591.9 | learning rate: 4.031E-05 | global batch size: 64 | lm loss: 6.454519E+00 | loss 
scale: 4096.0 | grad norm: 69198.394 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5476/ 159576 | consumed samples: 145712 | elapsed time per iteration (ms): 16643.0 | learning rate: 4.033E-05 | global batch size: 64 | lm loss: 6.424469E+00 | loss scale: 4096.0 | grad norm: 57626.523 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5477/ 159576 | consumed samples: 145776 | elapsed time per iteration (ms): 16362.1 | learning rate: 4.035E-05 | global batch size: 64 | lm loss: 6.285227E+00 | loss scale: 4096.0 | grad norm: 87864.793 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5478/ 159576 | consumed samples: 145840 | elapsed time per iteration (ms): 16435.9 | learning rate: 4.037E-05 | global batch size: 64 | lm loss: 6.372074E+00 | loss scale: 4096.0 | grad norm: 67542.980 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5479/ 159576 | consumed samples: 145904 | elapsed time per iteration (ms): 16597.3 | learning rate: 4.038E-05 | global batch size: 64 | lm loss: 6.438199E+00 | loss scale: 4096.0 | grad norm: 74119.106 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5480/ 159576 | consumed samples: 145968 | elapsed time per iteration (ms): 16483.8 | learning rate: 4.040E-05 | global batch size: 64 | lm loss: 6.487626E+00 | loss scale: 4096.0 | grad norm: 68136.689 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-[2021-09-25 00:07:47] PULSE: tr8-104B is waiting for the previous job to finish before scheduling a new one using the dependency mechanism (1165978_[1-10%1] on 'gpu_p13' partition)
-[2021-09-25 00:07:47] PULSE: tr8-104B is running for 18:15:36 since 2021-09-24T05:52:11 (1162855_1 on 'gpu_p13' partition (r6i4n[5,7],r6i5n[2,7-8],r6i6n[0,2,6],r7i2n[4-5],r7i6n[2-4],r7i7n[7-8],r8i0n[2-3,5-8],r8i1n[0,2-4],r8i2n8,r8i3n[0-2],r8i5n[3-4],r8i7n[3-8],r9i0n[0-2],r9i1n[0-3],r9i2n[3-5,8],r9i3n[0-1,7-8],r9i4n[0-2],r9i5n[3-8],r9i6n[0,7-8])
- iteration 5481/ 159576 | consumed samples: 146032 | elapsed time per iteration (ms): 16373.0 | learning rate: 4.042E-05 | global batch size: 64 | lm loss: 6.280901E+00 | loss scale: 4096.0 | grad norm: 89214.030 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5482/ 159576 | consumed samples: 146096 | elapsed time per iteration (ms): 16391.1 | learning rate: 4.044E-05 | global batch size: 64 | lm loss: 6.407492E+00 | loss scale: 4096.0 | grad norm: 71190.860 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5483/ 159576 | consumed samples: 146160 | elapsed time per iteration (ms): 16510.6 | learning rate: 4.046E-05 | global batch size: 64 | lm loss: 6.338043E+00 | loss scale: 4096.0 | grad norm: 80052.869 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5484/ 159576 | consumed samples: 146224 | elapsed time per iteration (ms): 16428.2 | learning rate: 4.047E-05 | global batch size: 64 | lm loss: 6.381162E+00 | loss scale: 4096.0 | grad norm: 66785.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5485/ 159576 | consumed samples: 146288 | elapsed time per iteration (ms): 16390.1 | learning rate: 4.049E-05
| global batch size: 64 | lm loss: 6.377982E+00 | loss scale: 4096.0 | grad norm: 73739.230 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5486/ 159576 | consumed samples: 146352 | elapsed time per iteration (ms): 16772.0 | learning rate: 4.051E-05 | global batch size: 64 | lm loss: 6.417017E+00 | loss scale: 4096.0 | grad norm: 101012.887 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5487/ 159576 | consumed samples: 146416 | elapsed time per iteration (ms): 16505.3 | learning rate: 4.053E-05 | global batch size: 64 | lm loss: 6.375125E+00 | loss scale: 4096.0 | grad norm: 62796.428 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5488/ 159576 | consumed samples: 146480 | elapsed time per iteration (ms): 16398.9 | learning rate: 4.054E-05 | global batch size: 64 | lm loss: 6.370068E+00 | loss scale: 4096.0 | grad norm: 53653.988 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5489/ 159576 | consumed samples: 146544 | elapsed time per iteration (ms): 16369.7 | learning rate: 4.056E-05 | global batch size: 64 | lm loss: 6.376281E+00 | loss scale: 4096.0 | grad norm: 81099.504 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5490/ 159576 | consumed samples: 146608 | elapsed time per iteration (ms): 16827.2 | learning rate: 4.058E-05 | global batch size: 64 | lm loss: 6.479604E+00 | loss scale: 4096.0 | grad norm: 63855.765 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5491/ 159576 | consumed samples: 146672 | elapsed time per iteration (ms): 16415.6 | learning rate: 4.060E-05 | global batch size: 64 | lm loss: 6.352095E+00 | loss scale: 4096.0 | grad norm: 55122.067 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5492/ 159576 | consumed samples: 146736 | elapsed time per iteration (ms): 16444.9 | learning rate: 4.062E-05 | global batch size: 64 | lm loss: 6.506047E+00 | loss scale: 4096.0 | grad norm: 75137.891 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5493/ 159576 | consumed samples: 146800 | elapsed time per iteration (ms): 16342.5 | learning rate: 4.063E-05 | global batch size: 64 | lm loss: 6.379695E+00 | loss scale: 4096.0 | grad norm: 66901.698 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5494/ 159576 | consumed samples: 146864 | elapsed time per iteration (ms): 16502.1 | learning rate: 4.065E-05 | global batch size: 64 | lm loss: 6.368460E+00 | loss scale: 4096.0 | grad norm: 77897.280 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5495/ 159576 | consumed samples: 146928 | elapsed time per iteration (ms): 16338.1 | learning rate: 4.067E-05 | global batch size: 64 | lm loss: 6.329938E+00 | loss scale: 4096.0 | grad norm: 61931.764 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5496/ 159576 | consumed samples: 146992 | elapsed time per iteration (ms): 16346.0 | learning rate: 4.069E-05 | global batch size: 64 | lm loss: 6.425272E+00 | loss scale: 4096.0 | grad norm: 66524.327 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan 
iterations: 0 | -time (ms) - iteration 5497/ 159576 | consumed samples: 147056 | elapsed time per iteration (ms): 16765.2 | learning rate: 4.070E-05 | global batch size: 64 | lm loss: 6.296051E+00 | loss scale: 4096.0 | grad norm: 85285.961 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5498/ 159576 | consumed samples: 147120 | elapsed time per iteration (ms): 16329.2 | learning rate: 4.072E-05 | global batch size: 64 | lm loss: 6.365289E+00 | loss scale: 4096.0 | grad norm: 66015.174 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5499/ 159576 | consumed samples: 147184 | elapsed time per iteration (ms): 16383.4 | learning rate: 4.074E-05 | global batch size: 64 | lm loss: 6.294851E+00 | loss scale: 4096.0 | grad norm: 79758.414 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5500/ 159576 | consumed samples: 147248 | elapsed time per iteration (ms): 16337.1 | learning rate: 4.076E-05 | global batch size: 64 | lm loss: 6.289442E+00 | loss scale: 4096.0 | grad norm: 74687.965 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5501/ 159576 | consumed samples: 147312 | elapsed time per iteration (ms): 16790.4 | learning rate: 4.078E-05 | global batch size: 64 | lm loss: 6.322903E+00 | loss scale: 4096.0 | grad norm: 77364.060 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5502/ 159576 | consumed samples: 147376 | elapsed time per iteration (ms): 16423.5 | learning rate: 4.079E-05 | global batch size: 64 | lm loss: 6.460203E+00 | loss scale: 4096.0 | grad norm: 73803.838 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5503/ 159576 | consumed samples: 147440 | elapsed time per iteration (ms): 16368.8 | learning rate: 4.081E-05 | global batch size: 64 | lm loss: 6.396315E+00 | loss scale: 4096.0 | grad norm: 71129.126 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5504/ 159576 | consumed samples: 147504 | elapsed time per iteration (ms): 16346.2 | learning rate: 4.083E-05 | global batch size: 64 | lm loss: 6.425894E+00 | loss scale: 4096.0 | grad norm: 98647.514 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5505/ 159576 | consumed samples: 147568 | elapsed time per iteration (ms): 16678.7 | learning rate: 4.085E-05 | global batch size: 64 | lm loss: 6.381792E+00 | loss scale: 4096.0 | grad norm: 89626.671 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5506/ 159576 | consumed samples: 147632 | elapsed time per iteration (ms): 16332.5 | learning rate: 4.086E-05 | global batch size: 64 | lm loss: 6.483613E+00 | loss scale: 4096.0 | grad norm: 94069.099 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5507/ 159576 | consumed samples: 147696 | elapsed time per iteration (ms): 16400.4 | learning rate: 4.088E-05 | global batch size: 64 | lm loss: 6.236539E+00 | loss scale: 4096.0 | grad norm: 66871.431 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5508/ 159576 | consumed samples: 147760 | elapsed time per iteration (ms): 16657.8 | learning rate: 4.090E-05 | global batch 
size: 64 | lm loss: 6.445796E+00 | loss scale: 4096.0 | grad norm: 79385.972 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5509/ 159576 | consumed samples: 147824 | elapsed time per iteration (ms): 16347.0 | learning rate: 4.092E-05 | global batch size: 64 | lm loss: 6.421635E+00 | loss scale: 4096.0 | grad norm: 76910.947 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5510/ 159576 | consumed samples: 147888 | elapsed time per iteration (ms): 16379.6 | learning rate: 4.093E-05 | global batch size: 64 | lm loss: 6.403854E+00 | loss scale: 4096.0 | grad norm: 131977.376 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5511/ 159576 | consumed samples: 147952 | elapsed time per iteration (ms): 16364.3 | learning rate: 4.095E-05 | global batch size: 64 | lm loss: 6.393543E+00 | loss scale: 4096.0 | grad norm: 62655.958 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5512/ 159576 | consumed samples: 148016 | elapsed time per iteration (ms): 16734.0 | learning rate: 4.097E-05 | global batch size: 64 | lm loss: 6.378099E+00 | loss scale: 4096.0 | grad norm: 71057.330 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5513/ 159576 | consumed samples: 148080 | elapsed time per iteration (ms): 16360.1 | learning rate: 4.099E-05 | global batch size: 64 | lm loss: 6.439700E+00 | loss scale: 4096.0 | grad norm: 78346.761 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5514/ 159576 | consumed samples: 148144 | elapsed time per iteration (ms): 16356.7 | learning rate: 4.101E-05 | global batch size: 64 | lm loss: 6.380426E+00 | loss scale: 4096.0 | grad norm: 65583.994 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5515/ 159576 | consumed samples: 148208 | elapsed time per iteration (ms): 16416.2 | learning rate: 4.102E-05 | global batch size: 64 | lm loss: 6.492000E+00 | loss scale: 4096.0 | grad norm: 73724.763 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5516/ 159576 | consumed samples: 148272 | elapsed time per iteration (ms): 16451.6 | learning rate: 4.104E-05 | global batch size: 64 | lm loss: 6.433869E+00 | loss scale: 4096.0 | grad norm: 93695.526 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5517/ 159576 | consumed samples: 148336 | elapsed time per iteration (ms): 16367.1 | learning rate: 4.106E-05 | global batch size: 64 | lm loss: 6.316652E+00 | loss scale: 4096.0 | grad norm: 93995.663 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5518/ 159576 | consumed samples: 148400 | elapsed time per iteration (ms): 16352.2 | learning rate: 4.108E-05 | global batch size: 64 | lm loss: 6.331068E+00 | loss scale: 4096.0 | grad norm: 64601.046 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5519/ 159576 | consumed samples: 148464 | elapsed time per iteration (ms): 16660.3 | learning rate: 4.109E-05 | global batch size: 64 | lm loss: 6.441586E+00 | loss scale: 4096.0 | grad norm: 74837.727 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time 
- iteration 5520/ 159576 | consumed samples: 148528 | elapsed time per iteration (ms): 16346.7 | learning rate: 4.111E-05 | global batch size: 64 | lm loss: 6.422507E+00 | loss scale: 4096.0 | grad norm: 57013.348 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5521/ 159576 | consumed samples: 148592 | elapsed time per iteration (ms): 16378.9 | learning rate: 4.113E-05 | global batch size: 64 | lm loss: 6.388858E+00 | loss scale: 4096.0 | grad norm: 70843.138 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5522/ 159576 | consumed samples: 148656 | elapsed time per iteration (ms): 16311.3 | learning rate: 4.115E-05 | global batch size: 64 | lm loss: 6.335554E+00 | loss scale: 4096.0 | grad norm: 57811.716 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5523/ 159576 | consumed samples: 148720 | elapsed time per iteration (ms): 16599.0 | learning rate: 4.117E-05 | global batch size: 64 | lm loss: 6.427087E+00 | loss scale: 4096.0 | grad norm: 70169.321 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5524/ 159576 | consumed samples: 148784 | elapsed time per iteration (ms): 16322.1 | learning rate: 4.118E-05 | global batch size: 64 | lm loss: 6.400644E+00 | loss scale: 4096.0 | grad norm: 65162.867 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5525/ 159576 | consumed samples: 148848 | elapsed time per iteration (ms): 16352.5 | learning rate: 4.120E-05 | global batch size: 64 | lm loss: 6.476854E+00 | loss scale: 4096.0 | grad norm: 105828.693 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5526/ 159576 | consumed samples: 148912 | elapsed time per iteration (ms): 16357.9 | learning rate: 4.122E-05 | global batch size: 64 | lm loss: 6.444851E+00 | loss scale: 4096.0 | grad norm: 100931.662 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5527/ 159576 | consumed samples: 148976 | elapsed time per iteration (ms): 16656.2 | learning rate: 4.124E-05 | global batch size: 64 | lm loss: 6.448713E+00 | loss scale: 4096.0 | grad norm: 81570.122 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5528/ 159576 | consumed samples: 149040 | elapsed time per iteration (ms): 16320.4 | learning rate: 4.125E-05 | global batch size: 64 | lm loss: 6.406240E+00 | loss scale: 4096.0 | grad norm: 82766.539 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5529/ 159576 | consumed samples: 149104 | elapsed time per iteration (ms): 16353.3 | learning rate: 4.127E-05 | global batch size: 64 | lm loss: 6.376573E+00 | loss scale: 4096.0 | grad norm: 80155.568 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5530/ 159576 | consumed samples: 149168 | elapsed time per iteration (ms): 16695.5 | learning rate: 4.129E-05 | global batch size: 64 | lm loss: 6.316214E+00 | loss scale: 4096.0 | grad norm: 87358.885 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5531/ 159576 | consumed samples: 149232 | elapsed time per iteration (ms): 16408.8 | learning rate: 4.131E-05 | global batch size: 64 | lm loss: 6.481884E+00 | loss scale: 4096.0 | grad norm: 86550.581 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5532/ 159576 | consumed samples: 149296 | elapsed time per iteration (ms): 16343.8 | learning rate: 4.133E-05 | global batch size: 64 | lm loss: 6.483734E+00 | loss scale: 4096.0 | grad norm: 89939.876 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5533/ 159576 | consumed samples: 149360 | elapsed time per iteration (ms): 16370.7 | learning rate: 4.134E-05 | global batch size: 64 | lm loss: 6.318271E+00 | loss scale: 4096.0 | grad norm: 60516.549 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5534/ 159576 | consumed samples: 149424 | elapsed time per iteration (ms): 16594.8 | learning rate: 4.136E-05 | global batch size: 64 | lm loss: 6.391500E+00 | loss scale: 4096.0 | grad norm: 70379.262 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5535/ 159576 | consumed samples: 149488 | elapsed time per iteration (ms): 16425.6 | learning rate: 4.138E-05 | global batch size: 64 | lm loss: 6.418231E+00 | loss scale: 4096.0 | grad norm: 76225.739 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5536/ 159576 | consumed samples: 149552 | elapsed time per iteration (ms): 16364.4 | learning rate: 4.140E-05 | global batch size: 64 | lm loss: 6.461292E+00 | loss scale: 4096.0 | grad norm: 117347.500 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5537/ 159576 | consumed samples: 149616 | elapsed time per iteration (ms): 16683.3 | learning rate: 4.141E-05 | global batch size: 64 | lm loss: 6.394395E+00 | loss scale: 4096.0 | grad norm: 113236.928 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5538/ 159576 | consumed samples: 149680 | elapsed time per iteration (ms): 16407.6 | learning rate: 4.143E-05 | global batch size: 64 | lm loss: 6.348366E+00 | loss scale: 4096.0 | grad norm: 72699.803 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5539/ 159576 | consumed samples: 149744 | elapsed time per iteration (ms): 16372.4 | learning rate: 4.145E-05 | global batch size: 64 | lm loss: 6.395003E+00 | loss scale: 4096.0 | grad norm: 117054.243 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5540/ 159576 | consumed samples: 149808 | elapsed time per iteration (ms): 16344.7 | learning rate: 4.147E-05 | global batch size: 64 | lm loss: 6.345469E+00 | loss scale: 4096.0 | grad norm: 66826.178 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5541/ 159576 | consumed samples: 149872 | elapsed time per iteration (ms): 16658.7 | learning rate: 4.149E-05 | global batch size: 64 | lm loss: 6.311511E+00 | loss scale: 4096.0 | grad norm: 82398.862 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5542/ 159576 | consumed samples: 149936 | elapsed time per iteration (ms): 16382.8 | learning rate: 4.150E-05 | global batch size: 64 | lm loss: 6.407408E+00 | loss scale: 4096.0 | grad norm: 95381.993 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5543/ 159576 | consumed samples: 150000 | elapsed time per iteration (ms): 16397.3 | learning rate: 4.152E-05 | global batch size: 64 | lm loss: 6.385950E+00 | loss scale: 4096.0 | grad norm: 84966.860 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5544/ 159576 | consumed samples: 150064 | elapsed time per iteration (ms): 16328.2 | learning rate: 4.154E-05 | global batch size: 64 | lm loss: 6.386173E+00 | loss scale: 4096.0 | grad norm: 76280.982 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5545/ 159576 | consumed samples: 150128 | elapsed time per iteration (ms): 16536.9 | learning rate: 4.156E-05 | global batch size: 64 | lm loss: 6.429965E+00 | loss scale: 4096.0 | grad norm: 86199.770 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5546/ 159576 | consumed samples: 150192 | elapsed time per iteration (ms): 16341.0 | learning rate: 4.157E-05 | global batch size: 64 | lm loss: 6.440814E+00 | loss scale: 4096.0 | grad norm: 79643.661 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5547/ 159576 | consumed samples: 150256 | elapsed time per iteration (ms): 16434.5 | learning rate: 4.159E-05 | global batch size: 64 | lm loss: 6.292027E+00 | loss scale: 4096.0 | grad norm: 79649.706 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5548/ 159576 | consumed samples: 150320 | elapsed time per iteration (ms): 16744.8 | learning rate: 4.161E-05 | global batch size: 64 | lm loss: 6.363777E+00 | loss scale: 4096.0 | grad norm: 105818.884 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5549/ 159576 | consumed samples: 150384 | elapsed time per iteration (ms): 16446.0 | learning rate: 4.163E-05 | global batch size: 64 | lm loss: 6.525520E+00 | loss scale: 4096.0 | grad norm: 98900.365 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5550/ 159576 | consumed samples: 150448 | elapsed time per iteration (ms): 16313.7 | learning rate: 4.164E-05 | global batch size: 64 | lm loss: 6.426298E+00 | loss scale: 4096.0 | grad norm: 160080.224 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5551/ 159576 | consumed samples: 150512 | elapsed time per iteration (ms): 16414.2 | learning rate: 4.166E-05 | global batch size: 64 | lm loss: 6.409907E+00 | loss scale: 4096.0 | grad norm: 101291.267 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5552/ 159576 | consumed samples: 150576 | elapsed time per iteration (ms): 16772.9 | learning rate: 4.168E-05 | global batch size: 64 | lm loss: 6.312022E+00 | loss scale: 4096.0 | grad norm: 93961.085 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5553/ 159576 | consumed samples: 150640 | elapsed time per iteration (ms): 16393.9 | learning rate: 4.170E-05 | global batch size: 64 | lm loss: 6.460764E+00 | loss scale: 4096.0 | grad norm: 83044.555 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5554/ 159576 | consumed samples: 150704 | elapsed time per iteration (ms): 16414.7 | learning rate: 4.172E-05 | global batch size: 64 | lm loss: 6.395907E+00 | loss scale: 4096.0 | grad norm: 71935.935 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5555/ 159576 | consumed samples: 150768 | elapsed time per iteration (ms): 16459.3 | learning rate: 4.173E-05 | global batch size: 64 | lm loss: 6.381772E+00 | loss scale: 4096.0 | grad norm: 92358.447 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5556/ 159576 | consumed samples: 150832 | elapsed time per iteration (ms): 16620.5 | learning rate: 4.175E-05 | global batch size: 64 | lm loss: 6.334103E+00 | loss scale: 4096.0 | grad norm: 135953.299 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5557/ 159576 | consumed samples: 150896 | elapsed time per iteration (ms): 16420.0 | learning rate: 4.177E-05 | global batch size: 64 | lm loss: 6.350534E+00 | loss scale: 4096.0 | grad norm: 106866.155 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5558/ 159576 | consumed samples: 150960 | elapsed time per iteration (ms): 16394.5 | learning rate: 4.179E-05 | global batch size: 64 | lm loss: 6.449617E+00 | loss scale: 4096.0 | grad norm: 73758.521 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5559/ 159576 | consumed samples: 151024 | elapsed time per iteration (ms): 16702.3 | learning rate: 4.180E-05 | global batch size: 64 | lm loss: 6.422152E+00 | loss scale: 4096.0 | grad norm: 89216.196 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5560/ 159576 | consumed samples: 151088 | elapsed time per iteration (ms): 16526.0 | learning rate: 4.182E-05 | global batch size: 64 | lm loss: 6.502412E+00 | loss scale: 4096.0 | grad norm: 75899.056 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5561/ 159576 | consumed samples: 151152 | elapsed time per iteration (ms): 16388.8 | learning rate: 4.184E-05 | global batch size: 64 | lm loss: 6.353260E+00 | loss scale: 4096.0 | grad norm: 77216.880 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5562/ 159576 | consumed samples: 151216 | elapsed time per iteration (ms): 16375.8 | learning rate: 4.186E-05 | global batch size: 64 | lm loss: 6.380834E+00 | loss scale: 4096.0 | grad norm: 108978.238 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5563/ 159576 | consumed samples: 151280 | elapsed time per iteration (ms): 16840.5 | learning rate: 4.188E-05 | global batch size: 64 | lm loss: 6.389106E+00 | loss scale: 4096.0 | grad norm: 109665.709 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5564/ 159576 | consumed samples: 151344 | elapsed time per iteration (ms): 16437.6 | learning rate: 4.189E-05 | global batch size: 64 | lm loss: 6.440452E+00 | loss scale: 4096.0 | grad norm: 455190.539 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5565/ 159576 | consumed samples: 151408 | elapsed time per iteration (ms): 16403.9 | learning rate: 4.191E-05 | global batch size: 64 | lm loss: 6.425446E+00 | loss scale: 4096.0 | grad norm: 121150.795 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5566/ 159576 | consumed samples: 151472 | elapsed time per iteration (ms): 16435.1 | learning rate: 4.193E-05 | global batch size: 64 | lm loss: 6.344089E+00 | loss scale: 4096.0 | grad norm: 92189.151 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5567/ 159576 | consumed samples: 151536 | elapsed time per iteration (ms): 16459.4 | learning rate: 4.195E-05 | global batch size: 64 | lm loss: 6.402337E+00 | loss scale: 4096.0 | grad norm: 84995.771 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5568/ 159576 | consumed samples: 151600 | elapsed time per iteration (ms): 16389.2 | learning rate: 4.196E-05 | global batch size: 64 | lm loss: 6.522965E+00 | loss scale: 4096.0 | grad norm: 82583.540 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5569/ 159576 | consumed samples: 151664 | elapsed time per iteration (ms): 16371.9 | learning rate: 4.198E-05 | global batch size: 64 | lm loss: 6.357002E+00 | loss scale: 4096.0 | grad norm: 107776.266 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5570/ 159576 | consumed samples: 151728 | elapsed time per iteration (ms): 16715.6 | learning rate: 4.200E-05 | global batch size: 64 | lm loss: 6.462955E+00 | loss scale: 4096.0 | grad norm: 81656.007 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5571/ 159576 | consumed samples: 151792 | elapsed time per iteration (ms): 16448.5 | learning rate: 4.202E-05 | global batch size: 64 | lm loss: 6.378518E+00 | loss scale: 4096.0 | grad norm: 97168.529 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5572/ 159576 | consumed samples: 151856 | elapsed time per iteration (ms): 16375.2 | learning rate: 4.204E-05 | global batch size: 64 | lm loss: 6.426227E+00 | loss scale: 4096.0 | grad norm: 138499.065 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5573/ 159576 | consumed samples: 151920 | elapsed time per iteration (ms): 16391.0 | learning rate: 4.205E-05 | global batch size: 64 | lm loss: 6.467142E+00 | loss scale: 4096.0 | grad norm: 86986.159 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5574/ 159576 | consumed samples: 151984 | elapsed time per iteration (ms): 16660.3 | learning rate: 4.207E-05 | global batch size: 64 | lm loss: 6.343758E+00 | loss scale: 4096.0 | grad norm: 94104.183 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5575/ 159576 | consumed samples: 152048 | elapsed time per iteration (ms): 16384.3 | learning rate: 4.209E-05 | global batch size: 64 | lm loss: 6.264513E+00 | loss scale: 4096.0 | grad norm: 84463.915 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5576/ 159576 | consumed samples: 152112 | elapsed time per iteration (ms): 16429.0 | learning rate: 4.211E-05 | global batch size: 64 | lm loss: 6.395695E+00 | loss scale: 4096.0 | grad norm: 91060.071 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5577/ 159576 | consumed samples: 152176 | elapsed time per iteration (ms): 16399.6 | learning rate: 4.212E-05 | global batch size: 64 | lm loss: 6.322819E+00 | loss scale: 4096.0 | grad norm: 78884.092 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5578/ 159576 | consumed samples: 152240 | elapsed time per iteration (ms): 16529.4 | learning rate: 4.214E-05 | global batch size: 64 | lm loss: 6.361033E+00 | loss scale: 4096.0 | grad norm: 132712.269 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5579/ 159576 | consumed samples: 152304 | elapsed time per iteration (ms): 16454.4 | learning rate: 4.216E-05 | global batch size: 64 | lm loss: 6.276022E+00 | loss scale: 4096.0 | grad norm: 112417.567 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5580/ 159576 | consumed samples: 152368 | elapsed time per iteration (ms): 16401.1 | learning rate: 4.218E-05 | global batch size: 64 | lm loss: 6.375633E+00 | loss scale: 4096.0 | grad norm: 85824.899 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5581/ 159576 | consumed samples: 152432 | elapsed time per iteration (ms): 16688.1 | learning rate: 4.220E-05 | global batch size: 64 | lm loss: 6.447036E+00 | loss scale: 4096.0 | grad norm: 88314.135 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5582/ 159576 | consumed samples: 152496 | elapsed time per iteration (ms): 16427.8 | learning rate: 4.221E-05 | global batch size: 64 | lm loss: 6.438461E+00 | loss scale: 4096.0 | grad norm: 91826.151 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5583/ 159576 | consumed samples: 152560 | elapsed time per iteration (ms): 16326.4 | learning rate: 4.223E-05 | global batch size: 64 | lm loss: 6.404251E+00 | loss scale: 4096.0 | grad norm: 79746.044 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5584/ 159576 | consumed samples: 152624 | elapsed time per iteration (ms): 16429.7 | learning rate: 4.225E-05 | global batch size: 64 | lm loss: 6.470784E+00 | loss scale: 4096.0 | grad norm: 78255.053 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5585/ 159576 | consumed samples: 152688 | elapsed time per iteration (ms): 16577.7 | learning rate: 4.227E-05 | global batch size: 64 | lm loss: 6.352365E+00 | loss scale: 4096.0 | grad norm: 85894.611 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5586/ 159576 | consumed samples: 152752 | elapsed time per iteration (ms): 16409.6 | learning rate: 4.228E-05 | global batch size: 64 | lm loss: 6.367690E+00 | loss scale: 4096.0 | grad norm: 268686.463 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5587/ 159576 | consumed samples: 152816 | elapsed time per iteration (ms): 16393.7 | learning rate: 4.230E-05 | global batch size: 64 | lm loss: 6.334382E+00 | loss scale: 4096.0 | grad norm: 92996.321 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5588/ 159576 | consumed samples: 152880 | elapsed time per iteration (ms): 16647.8 | learning rate: 4.232E-05 | global batch size: 64 | lm loss: 6.174354E+00 | loss scale: 4096.0 | grad norm: 99570.185 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5589/ 159576 | consumed samples: 152944 | elapsed time per iteration (ms): 16470.5 | learning rate: 4.234E-05 | global batch size: 64 | lm loss: 6.349049E+00 | loss scale: 4096.0 | grad norm: 74523.491 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5590/ 159576 | consumed samples: 153008 | elapsed time per iteration (ms): 16348.7 | learning rate: 4.236E-05 | global batch size: 64 | lm loss: 6.388356E+00 | loss scale: 4096.0 | grad norm: 57623.843 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5591/ 159576 | consumed samples: 153072 | elapsed time per iteration (ms): 16338.9 | learning rate: 4.237E-05 | global batch size: 64 | lm loss: 6.399694E+00 | loss scale: 4096.0 | grad norm: 75852.068 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5592/ 159576 | consumed samples: 153136 | elapsed time per iteration (ms): 16704.7 | learning rate: 4.239E-05 | global batch size: 64 | lm loss: 6.327959E+00 | loss scale: 4096.0 | grad norm: 69452.758 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5593/ 159576 | consumed samples: 153200 | elapsed time per iteration (ms): 16334.3 | learning rate: 4.241E-05 | global batch size: 64 | lm loss: 6.435533E+00 | loss scale: 4096.0 | grad norm: 111529.645 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5594/ 159576 | consumed samples: 153264 | elapsed time per iteration (ms): 16385.3 | learning rate: 4.243E-05 | global batch size: 64 | lm loss: 6.438297E+00 | loss scale: 4096.0 | grad norm: 154695.606 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5595/ 159576 | consumed samples: 153328 | elapsed time per iteration (ms): 16343.1 | learning rate: 4.244E-05 | global batch size: 64 | lm loss: 6.431480E+00 | loss scale: 4096.0 | grad norm: 133987.791 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5596/ 159576 | consumed samples: 153392 | elapsed time per iteration (ms): 16571.5 | learning rate: 4.246E-05 | global batch size: 64 | lm loss: 6.326744E+00 | loss scale: 4096.0 | grad norm: 65072.384 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5597/ 159576 | consumed samples: 153456 | elapsed time per iteration (ms): 16304.0 | learning rate: 4.248E-05 | global batch size: 64 | lm loss: 6.450805E+00 | loss scale: 4096.0 | grad norm: 67613.081 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5598/ 159576 | consumed samples: 153520 | elapsed time per iteration (ms): 16343.8 | learning rate: 4.250E-05 | global batch size: 64 | lm loss: 6.327376E+00 | loss scale: 4096.0 | grad norm: 77614.563 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5599/ 159576 | consumed samples: 153584 | elapsed time per iteration (ms): 16672.4 | learning rate: 4.251E-05 | global batch size: 64 | lm loss: 6.502485E+00 | loss scale: 4096.0 | grad norm: 97568.320 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5600/ 159576 | consumed samples: 153648 | elapsed time per iteration (ms): 16410.3 | learning rate: 4.253E-05 | global batch size: 64 | lm loss: 6.429380E+00 | loss scale: 4096.0 | grad norm: 84231.513 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5601/ 159576 | consumed samples: 153712 | elapsed time per iteration (ms): 16391.0 | learning rate: 4.255E-05 | global batch size: 64 | lm loss: 6.436201E+00 | loss scale: 4096.0 | grad norm: 63319.618 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5602/ 159576 | consumed samples: 153776 | elapsed time per iteration (ms): 16453.8 | learning rate: 4.257E-05 | global batch size: 64 | lm loss: 6.263167E+00 | loss scale: 4096.0 | grad norm: 71392.865 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5603/ 159576 | consumed samples: 153840 | elapsed time per iteration (ms): 16775.3 | learning rate: 4.259E-05 | global batch size: 64 | lm loss: 6.413259E+00 | loss scale: 4096.0 | grad norm: 123761.558 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5604/ 159576 | consumed samples: 153904 | elapsed time per iteration (ms): 16504.7 | learning rate: 4.260E-05 | global batch size: 64 | lm loss: 6.544505E+00 | loss scale: 4096.0 | grad norm: 83624.860 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5605/ 159576 | consumed samples: 153968 | elapsed time per iteration (ms): 16306.6 | learning rate: 4.262E-05 | global batch size: 64 | lm loss: 6.452788E+00 | loss scale: 8192.0 | grad norm: 65011.271 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5606/ 159576 | consumed samples: 154032 | elapsed time per iteration (ms): 16378.4 | learning rate: 4.264E-05 | global batch size: 64 | lm loss: 6.422714E+00 | loss scale: 8192.0 | grad norm: 246798.721 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5607/ 159576 | consumed samples: 154096 | elapsed time per iteration (ms): 16552.8 | learning rate: 4.266E-05 | global batch size: 64 | lm loss: 6.375990E+00 | loss scale: 8192.0 | grad norm: 169739.944 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5608/ 159576 | consumed samples: 154160 | elapsed time per iteration (ms): 16382.8 | learning rate: 4.267E-05 | global batch size: 64 | lm loss: 6.358736E+00 | loss scale: 8192.0 | grad norm: 157950.735 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5609/ 159576 | consumed samples: 154224 | elapsed time per iteration (ms): 16422.0 | learning rate: 4.269E-05 | global batch size: 64 | lm loss: 6.444921E+00 | loss scale: 8192.0 | grad norm: 125911.826 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5610/ 159576 | consumed samples: 154288 | elapsed time per iteration (ms): 9561.0 | learning rate: 4.269E-05 | global batch size: 64 | lm loss: 6.367582E+00 | loss scale: 8192.0 | grad norm: 125911.826 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5611/ 159576 | consumed samples: 154352 | elapsed time per iteration (ms): 16020.4 | learning rate: 4.271E-05 | global batch size: 64 | lm loss: 6.341266E+00 | loss scale: 8192.0 | grad norm: 196277.090 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
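Two loss-scale events are visible around here, and both look like standard fp16 dynamic loss scaling rather than anything specific to this run. At iteration 5605 the scale doubles from 4096.0 to 8192.0 after a long run of overflow-free steps; at iterations 5643 and 5703 (below) it halves right after a conspicuously short step (~9.5 s instead of ~16.4 s) whose learning rate and grad norm simply repeat the previous record, the usual signature of a skipped overflow step. Iteration 5610 shows the same skip signature without a halving, which would be consistent with a scaler that tolerates a first overflow before backing off (hysteresis); the exact policy is not visible in the log. A generic sketch of the mechanism, with growth_interval=1000 and factor=2.0 as assumed typical defaults:

    # Generic dynamic loss-scaling policy (a reconstruction for illustration,
    # not this repo's implementation; production scalers usually also add a
    # hysteresis count and a minimum scale).
    class DynamicLossScaler:
        def __init__(self, init_scale=4096.0, growth_interval=1000, factor=2.0):
            self.scale = init_scale        # multiplies the loss before backward
            self.growth_interval = growth_interval
            self.factor = factor
            self.good_steps = 0            # overflow-free steps since last change

        def update(self, found_overflow):
            """Adjust the scale after a step; return True if the step may be applied."""
            if found_overflow:
                self.scale /= self.factor  # e.g. 8192.0 -> 4096.0 at iteration 5643
                self.good_steps = 0
                return False               # gradients discarded, lr does not advance
            self.good_steps += 1
            if self.good_steps >= self.growth_interval:
                self.scale *= self.factor  # e.g. 4096.0 -> 8192.0 at iteration 5605
                self.good_steps = 0
            return True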
- iteration 5612/ 159576 | consumed samples: 154416 | elapsed time per iteration (ms): 16411.4 | learning rate: 4.273E-05 | global batch size: 64 | lm loss: 6.386235E+00 | loss scale: 8192.0 | grad norm: 174236.115 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5613/ 159576 | consumed samples: 154480 | elapsed time per iteration (ms): 16406.8 | learning rate: 4.275E-05 | global batch size: 64 | lm loss: 6.302393E+00 | loss scale: 8192.0 | grad norm: 159949.232 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5614/ 159576 | consumed samples: 154544 | elapsed time per iteration (ms): 16823.0 | learning rate: 4.276E-05 | global batch size: 64 | lm loss: 6.427998E+00 | loss scale: 8192.0 | grad norm: 139822.570 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5615/ 159576 | consumed samples: 154608 | elapsed time per iteration (ms): 16523.9 | learning rate: 4.278E-05 | global batch size: 64 | lm loss: 6.437964E+00 | loss scale: 8192.0 | grad norm: 148561.538 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5616/ 159576 | consumed samples: 154672 | elapsed time per iteration (ms): 16444.1 | learning rate: 4.280E-05 | global batch size: 64 | lm loss: 6.387279E+00 | loss scale: 8192.0 | grad norm: 165172.047 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5617/ 159576 | consumed samples: 154736 | elapsed time per iteration (ms): 16455.6 | learning rate: 4.282E-05 | global batch size: 64 | lm loss: 6.365323E+00 | loss scale: 8192.0 | grad norm: 139740.137 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5618/ 159576 | consumed samples: 154800 | elapsed time per iteration (ms): 16876.6 | learning rate: 4.283E-05 | global batch size: 64 | lm loss: 6.405371E+00 | loss scale: 8192.0 | grad norm: 191865.773 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5619/ 159576 | consumed samples: 154864 | elapsed time per iteration (ms): 16465.6 | learning rate: 4.285E-05 | global batch size: 64 | lm loss: 6.400004E+00 | loss scale: 8192.0 | grad norm: 131301.224 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5620/ 159576 | consumed samples: 154928 | elapsed time per iteration (ms): 16407.9 | learning rate: 4.287E-05 | global batch size: 64 | lm loss: 6.424757E+00 | loss scale: 8192.0 | grad norm: 152162.206 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5621/ 159576 | consumed samples: 154992 | elapsed time per iteration (ms): 16429.7 | learning rate: 4.289E-05 | global batch size: 64 | lm loss: 6.415905E+00 | loss scale: 8192.0 | grad norm: 184054.677 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5622/ 159576 | consumed samples: 155056 | elapsed time per iteration (ms): 16685.6 | learning rate: 4.291E-05 | global batch size: 64 | lm loss: 6.440601E+00 | loss scale: 8192.0 | grad norm: 290641.293 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5623/ 159576 | consumed samples: 155120 | elapsed time per iteration (ms): 16500.9 | learning rate: 4.292E-05 | global batch size: 64 | lm loss: 6.392663E+00 | loss scale: 8192.0 | grad norm: 151394.636 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5624/ 159576 | consumed samples: 155184 | elapsed time per iteration (ms): 16485.6 | learning rate: 4.294E-05 | global batch size: 64 | lm loss: 6.440325E+00 | loss scale: 8192.0 | grad norm: 132735.433 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5625/ 159576 | consumed samples: 155248 | elapsed time per iteration (ms): 16832.2 | learning rate: 4.296E-05 | global batch size: 64 | lm loss: 6.382560E+00 | loss scale: 8192.0 | grad norm: 167706.666 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5626/ 159576 | consumed samples: 155312 | elapsed time per iteration (ms): 16294.5 | learning rate: 4.298E-05 | global batch size: 64 | lm loss: 6.422318E+00 | loss scale: 8192.0 | grad norm: 144671.305 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5627/ 159576 | consumed samples: 155376 | elapsed time per iteration (ms): 16433.6 | learning rate: 4.299E-05 | global batch size: 64 | lm loss: 6.400022E+00 | loss scale: 8192.0 | grad norm: 174837.579 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5628/ 159576 | consumed samples: 155440 | elapsed time per iteration (ms): 16385.0 | learning rate: 4.301E-05 | global batch size: 64 | lm loss: 6.465958E+00 | loss scale: 8192.0 | grad norm: 167317.809 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5629/ 159576 | consumed samples: 155504 | elapsed time per iteration (ms): 16829.3 | learning rate: 4.303E-05 | global batch size: 64 | lm loss: 6.365539E+00 | loss scale: 8192.0 | grad norm: 150073.382 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5630/ 159576 | consumed samples: 155568 | elapsed time per iteration (ms): 16533.0 | learning rate: 4.305E-05 | global batch size: 64 | lm loss: 6.385098E+00 | loss scale: 8192.0 | grad norm: 132923.540 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5631/ 159576 | consumed samples: 155632 | elapsed time per iteration (ms): 16451.7 | learning rate: 4.307E-05 | global batch size: 64 | lm loss: 6.314290E+00 | loss scale: 8192.0 | grad norm: 178222.521 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5632/ 159576 | consumed samples: 155696 | elapsed time per iteration (ms): 16400.8 | learning rate: 4.308E-05 | global batch size: 64 | lm loss: 6.467572E+00 | loss scale: 8192.0 | grad norm: 147545.253 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5633/ 159576 | consumed samples: 155760 | elapsed time per iteration (ms): 16566.1 | learning rate: 4.310E-05 | global batch size: 64 | lm loss: 6.341013E+00 | loss scale: 8192.0 | grad norm: 200712.657 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5634/ 159576 | consumed samples: 155824 | elapsed time per iteration (ms): 16393.9 | learning rate: 4.312E-05 | global batch size: 64 | lm loss: 6.319093E+00 | loss scale: 8192.0 | grad norm: 161666.056 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5635/ 159576 | consumed samples: 155888 | elapsed time per iteration (ms): 16416.9 | learning rate: 4.314E-05 | global batch size: 64 | lm loss: 6.461274E+00 | loss scale: 8192.0 | grad norm: 572124.260 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5636/ 159576 | consumed samples: 155952 | elapsed time per iteration (ms): 16756.4 | learning rate: 4.315E-05 | global batch size: 64 | lm loss: 6.453969E+00 | loss scale: 8192.0 | grad norm: 205582.781 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5637/ 159576 | consumed samples: 156016 | elapsed time per iteration (ms): 16349.2 | learning rate: 4.317E-05 | global batch size: 64 | lm loss: 6.386354E+00 | loss scale: 8192.0 | grad norm: 188662.234 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5638/ 159576 | consumed samples: 156080 | elapsed time per iteration (ms): 16437.2 | learning rate: 4.319E-05 | global batch size: 64 | lm loss: 6.458478E+00 | loss scale: 8192.0 | grad norm: 208129.298 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5639/ 159576 | consumed samples: 156144 | elapsed time per iteration (ms): 16478.4 | learning rate: 4.321E-05 | global batch size: 64 | lm loss: 6.361111E+00 | loss scale: 8192.0 | grad norm: 383224.074 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5640/ 159576 | consumed samples: 156208 | elapsed time per iteration (ms): 16543.3 | learning rate: 4.322E-05 | global batch size: 64 | lm loss: 6.470639E+00 | loss scale: 8192.0 | grad norm: 244281.048 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5641/ 159576 | consumed samples: 156272 | elapsed time per iteration (ms): 16418.6 | learning rate: 4.324E-05 | global batch size: 64 | lm loss: 6.453573E+00 | loss scale: 8192.0 | grad norm: 246555.042 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5642/ 159576 | consumed samples: 156336 | elapsed time per iteration (ms): 16347.0 | learning rate: 4.326E-05 | global batch size: 64 | lm loss: 6.416644E+00 | loss scale: 8192.0 | grad norm: 177394.161 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5643/ 159576 | consumed samples: 156400 | elapsed time per iteration (ms): 9564.0 | learning rate: 4.326E-05 | global batch size: 64 | lm loss: 6.433064E+00 | loss scale: 4096.0 | grad norm: 177394.161 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5644/ 159576 | consumed samples: 156464 | elapsed time per iteration (ms): 16246.5 | learning rate: 4.328E-05 | global batch size: 64 | lm loss: 6.334921E+00 | loss scale: 4096.0 | grad norm: 91031.712 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5645/ 159576 | consumed samples: 156528 | elapsed time per iteration (ms): 16410.8 | learning rate: 4.330E-05 | global batch size: 64 | lm loss: 6.398224E+00 | loss scale: 4096.0 | grad norm: 82899.277 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5646/ 159576 | consumed samples: 156592 | elapsed time per iteration (ms): 16332.5 | learning rate: 4.331E-05 | global batch size: 64 | lm loss: 6.469447E+00 | loss scale: 4096.0 | grad norm: 93235.700 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5647/ 159576 | consumed samples: 156656 | elapsed time per iteration (ms): 16380.9 | learning rate: 4.333E-05 | global batch size: 64 | lm loss: 6.414939E+00 | loss scale: 4096.0 | grad norm: 98498.938 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5648/ 159576 | consumed samples: 156720 | elapsed time per iteration (ms): 16453.9 | learning rate: 4.335E-05 | global batch size: 64 | lm loss: 6.435335E+00 | loss scale: 4096.0 | grad norm: 110431.475 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5649/ 159576 | consumed samples: 156784 | elapsed time per iteration (ms): 16375.1 | learning rate: 4.337E-05 | global batch size: 64 | lm loss: 6.367991E+00 | loss scale: 4096.0 | grad norm: 112025.804 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5650/ 159576 | consumed samples: 156848 | elapsed time per iteration (ms): 16396.5 | learning rate: 4.338E-05 | global batch size: 64 | lm loss: 6.453450E+00 | loss scale: 4096.0 | grad norm: 142538.254 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5651/ 159576 | consumed samples: 156912 | elapsed time per iteration (ms): 16662.1 | learning rate: 4.340E-05 | global batch size: 64 | lm loss: 6.376512E+00 | loss scale: 4096.0 | grad norm: 104884.454 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5652/ 159576 | consumed samples: 156976 | elapsed time per iteration (ms): 16397.7 | learning rate: 4.342E-05 | global batch size: 64 | lm loss: 6.398083E+00 | loss scale: 4096.0 | grad norm: 97434.412 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5653/ 159576 | consumed samples: 157040 | elapsed time per iteration (ms): 16367.3 | learning rate: 4.344E-05 | global batch size: 64 | lm loss: 6.468301E+00 | loss scale: 4096.0 | grad norm: 189503.731 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5654/ 159576 | consumed samples: 157104 | elapsed time per iteration (ms): 16332.7 | learning rate: 4.346E-05 | global batch size: 64 | lm loss: 6.449702E+00 | loss scale: 4096.0 | grad norm: 101635.410 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5655/ 159576 | consumed samples: 157168 | elapsed time per iteration (ms): 16814.3 | learning rate: 4.347E-05 | global batch size: 64 | lm loss: 6.417078E+00 | loss scale: 4096.0 | grad norm: 163445.588 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5656/ 159576 | consumed samples: 157232 | elapsed time per iteration (ms): 16304.4 | learning rate: 4.349E-05 | global batch size: 64 | lm loss: 6.445296E+00 | loss scale: 4096.0 | grad norm: 90409.939 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5657/ 159576 | consumed samples: 157296 | elapsed time per iteration (ms): 16400.9 | learning rate: 4.351E-05 | global batch size: 64 | lm loss: 6.445564E+00 | loss scale: 4096.0 | grad norm: 81513.234 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5658/ 159576 | consumed samples: 157360 | elapsed time per iteration (ms): 16340.5 | learning rate: 4.353E-05 | global batch size: 64 | lm loss: 6.333720E+00 | loss scale: 4096.0 | grad norm: 134428.283 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5659/ 159576 | consumed samples: 157424 | elapsed time per iteration (ms): 16553.5 | learning rate: 4.354E-05 | global batch size: 64 | lm loss: 6.401426E+00 | loss scale: 4096.0 | grad norm: 106022.946 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5660/ 159576 | consumed samples: 157488 | elapsed time per iteration (ms): 16387.3 | learning rate: 4.356E-05 | global batch size: 64 | lm loss: 6.438997E+00 | loss scale: 4096.0 | grad norm: 83955.207 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5661/ 159576 | consumed samples: 157552 | elapsed time per iteration (ms): 16456.3 | learning rate: 4.358E-05 | global batch size: 64 | lm loss: 6.402083E+00 | loss scale: 4096.0 | grad norm: 85068.294 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5662/ 159576 | consumed samples: 157616 | elapsed time per iteration (ms): 16696.8 | learning rate: 4.360E-05 | global batch size: 64 | lm loss: 6.441435E+00 | loss scale: 4096.0 | grad norm: 101578.461 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5663/ 159576 | consumed samples: 157680 | elapsed time per iteration (ms): 16497.3 | learning rate: 4.362E-05 | global batch size: 64 | lm loss: 6.405056E+00 | loss scale: 4096.0 | grad norm: 90814.200 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5664/ 159576 | consumed samples: 157744 | elapsed time per iteration (ms): 16393.8 | learning rate: 4.363E-05 | global batch size: 64 | lm loss: 6.437488E+00 | loss scale: 4096.0 | grad norm: 99258.240 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5665/ 159576 | consumed samples: 157808 | elapsed time per iteration (ms): 16464.8 | learning rate: 4.365E-05 | global batch size: 64 | lm loss: 6.461691E+00 | loss scale: 4096.0 | grad norm: 150615.188 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5666/ 159576 | consumed samples: 157872 | elapsed time per iteration (ms): 16442.6 | learning rate: 4.367E-05 | global batch size: 64 | lm loss: 6.379485E+00 | loss scale: 4096.0 | grad norm: 87553.112 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5667/ 159576 | consumed samples: 157936 | elapsed time per iteration (ms): 16408.0 | learning rate: 4.369E-05 | global batch size: 64 | lm loss: 6.436778E+00 | loss scale: 4096.0 | grad norm: 86837.058 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5668/ 159576 | consumed samples: 158000 | elapsed time per iteration (ms): 16382.6 | learning rate: 4.370E-05 | global batch size: 64 | lm loss: 6.456222E+00 | loss scale: 4096.0 | grad norm: 81561.808 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5669/ 159576 | consumed samples: 158064 | elapsed time per iteration (ms): 16606.9 | learning rate: 4.372E-05 | global batch size: 64 | lm loss: 6.394565E+00 | loss scale: 4096.0 | grad norm: 90655.669 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5670/ 159576 | consumed samples: 158128 | elapsed time per iteration (ms): 16482.0 | learning rate: 4.374E-05 | global batch size: 64 | lm loss: 6.388999E+00 | loss scale: 4096.0 | grad norm: 139861.145 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5671/ 159576 | consumed samples: 158192 | elapsed time per iteration (ms): 16430.2 | learning rate: 4.376E-05 | global batch size: 64 | lm loss: 6.348672E+00 | loss scale: 4096.0 | grad norm: 79933.242 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5672/ 159576 | consumed samples: 158256 | elapsed time per iteration (ms): 16343.5 | learning rate: 4.378E-05 | global batch size: 64 | lm loss: 6.358377E+00 | loss scale: 4096.0 | grad norm: 91907.327 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5673/ 159576 | consumed samples: 158320 | elapsed time per iteration (ms): 16738.6 | learning rate: 4.379E-05 | global batch size: 64 | lm loss: 6.397278E+00 | loss scale: 4096.0 | grad norm: 81347.015 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5674/ 159576 | consumed samples: 158384 | elapsed time per iteration (ms): 16377.1 | learning rate: 4.381E-05 | global batch size: 64 | lm loss: 6.330511E+00 | loss scale: 4096.0 | grad norm: 87623.840 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5675/ 159576 | consumed samples: 158448 | elapsed time per iteration (ms): 16376.8 | learning rate: 4.383E-05 | global batch size: 64 | lm loss: 6.400737E+00 | loss scale: 4096.0 | grad norm: 86243.502 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5676/ 159576 | consumed samples: 158512 | elapsed time per iteration (ms): 16407.2 | learning rate: 4.385E-05 | global batch size: 64 | lm loss: 6.373343E+00 | loss scale: 4096.0 | grad norm: 112233.960 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5677/ 159576 | consumed samples: 158576 | elapsed time per iteration (ms): 16504.3 | learning rate: 4.386E-05 | global batch size: 64 | lm loss: 6.340403E+00 | loss scale: 4096.0 | grad norm: 87545.481 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5678/ 159576 | consumed samples: 158640 | elapsed time per iteration (ms): 16469.6 | learning rate: 4.388E-05 | global batch size: 64 | lm loss: 6.483582E+00 | loss scale: 4096.0 | grad norm: 85898.534 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5679/ 159576 | consumed samples: 158704 | elapsed time per iteration (ms): 16363.2 | learning rate: 4.390E-05 | global batch size: 64 | lm loss: 6.384809E+00 | loss scale: 4096.0 | grad norm: 75822.296 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5680/ 159576 | consumed samples: 158768 | elapsed time per iteration (ms): 16705.5 | learning rate: 4.392E-05 | global batch size: 64 | lm loss: 6.360985E+00 | loss scale: 4096.0 | grad norm: 93411.572 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5681/ 159576 | consumed samples: 158832 | elapsed time per iteration (ms): 16533.6 | learning rate: 4.393E-05 | global batch size: 64 | lm loss: 6.346332E+00 | loss scale: 4096.0 | grad norm: 98347.186 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5682/ 159576 | consumed samples: 158896 | elapsed time per iteration (ms): 16424.8 | learning rate: 4.395E-05 | global batch size: 64 | lm loss: 6.452760E+00 | loss scale: 4096.0 | grad norm: 113842.784 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5683/ 159576 | consumed samples: 158960 | elapsed time per iteration (ms): 16412.1 | learning rate: 4.397E-05 | global batch size: 64 | lm loss: 6.394449E+00 | loss scale: 4096.0 | grad norm: 225192.085 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5684/ 159576 | consumed samples: 159024 | elapsed time per iteration (ms): 16934.4 | learning rate: 4.399E-05 | global batch size: 64 | lm loss: 6.394941E+00 | loss scale: 4096.0 | grad norm: 81396.577 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5685/ 159576 | consumed samples: 159088 | elapsed time per iteration (ms): 16454.0 | learning rate: 4.401E-05 | global batch size: 64 | lm loss: 6.261321E+00 | loss scale: 4096.0 | grad norm: 86149.759 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5686/ 159576 | consumed samples: 159152 | elapsed time per iteration (ms): 16431.5 | learning rate: 4.402E-05 | global batch size: 64 | lm loss: 6.492159E+00 | loss scale: 4096.0 | grad norm: 119300.666 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5687/ 159576 | consumed samples: 159216 | elapsed time per iteration (ms): 16386.6 | learning rate: 4.404E-05 | global batch size: 64 | lm loss: 6.511878E+00 | loss scale: 4096.0 | grad norm: 91338.030 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5688/ 159576 | consumed samples: 159280 | elapsed time per iteration (ms): 16584.3 | learning rate: 4.406E-05 | global batch size: 64 | lm loss: 6.442091E+00 | loss scale: 4096.0 | grad norm: 127329.065 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5689/ 159576 | consumed samples: 159344 | elapsed time per iteration (ms): 16414.9 | learning rate: 4.408E-05 | global batch size: 64 | lm loss: 6.445393E+00 | loss scale: 4096.0 | grad norm: 74818.326 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5690/ 159576 | consumed samples: 159408 | elapsed time per iteration (ms): 16438.8 | learning rate: 4.409E-05 | global batch size: 64 | lm loss: 6.349149E+00 | loss scale: 4096.0 | grad norm: 90721.765 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5691/ 159576 | consumed samples: 159472 | elapsed time per iteration (ms): 16762.3 | learning rate: 4.411E-05 | global batch size: 64 | lm loss: 6.450273E+00 | loss scale: 4096.0 | grad norm: 84948.864 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5692/ 159576 | consumed samples: 159536 | elapsed time per iteration (ms): 16461.8 | learning rate: 4.413E-05 | global batch size: 64 | lm loss: 6.451497E+00 | loss scale: 4096.0 | grad norm: 160376.410 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5693/ 159576 | consumed samples: 159600 | elapsed time per iteration (ms): 16376.8 | learning rate: 4.415E-05 | global batch size: 64 | lm loss: 6.414182E+00 | loss scale: 4096.0 | grad norm: 64931.477 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5694/ 159576 | consumed samples: 159664 | elapsed time per iteration (ms): 16448.9 | learning rate: 4.417E-05 | global batch size: 64 | lm loss: 6.392116E+00 | loss scale: 4096.0 | grad norm: 82604.441 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5695/ 159576 | consumed samples: 159728 | elapsed time per iteration (ms): 16621.3 | learning rate: 4.418E-05 | global batch size: 64 | lm loss: 6.379553E+00 | loss scale: 4096.0 | grad norm: 96286.790 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5696/ 159576 | consumed samples: 159792 | elapsed time per iteration (ms): 16447.4 | learning rate: 4.420E-05 | global batch size: 64 | lm loss: 6.319911E+00 | loss scale: 4096.0 | grad norm: 113489.439 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5697/ 159576 | consumed samples: 159856 | elapsed time per iteration (ms): 16402.2 | learning rate: 4.422E-05 | global batch size: 64 | lm loss: 6.538674E+00 | loss scale: 4096.0 | grad norm: 88380.285 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5698/ 159576 | consumed samples: 159920 | elapsed time per iteration (ms): 16462.3 | learning rate: 4.424E-05 | global batch size: 64 | lm loss: 6.450464E+00 | loss scale: 4096.0 | grad norm: 86213.288 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5699/ 159576 | consumed samples: 159984 | elapsed time per iteration (ms): 16600.9 | learning rate: 4.425E-05 | global batch size: 64 | lm loss: 6.360521E+00 | loss scale: 4096.0 | grad norm: 102839.632 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5700/ 159576 | consumed samples: 160048 | elapsed time per iteration (ms): 16472.4 | learning rate: 4.427E-05 | global batch size: 64 | lm loss: 6.408503E+00 | loss scale: 4096.0 | grad norm: 93708.692 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-[2021-09-25 01:07:57] PULSE: tr8-104B is waiting for the previous job to finish before scheduling a new one using the dependency mechanism (1165978_[1-10%1] on 'gpu_p13' partition)
-[2021-09-25 01:07:57] PULSE: tr8-104B is running for 19:15:46 since 2021-09-24T05:52:11 (1162855_1 on 'gpu_p13' partition (r6i4n[5,7],r6i5n[2,7-8],r6i6n[0,2,6],r7i2n[4-5],r7i6n[2-4],r7i7n[7-8],r8i0n[2-3,5-8],r8i1n[0,2-4],r8i2n8,r8i3n[0-2],r8i5n[3-4],r8i7n[3-8],r9i0n[0-2],r9i1n[0-3],r9i2n[3-5,8],r9i3n[0-1,7-8],r9i4n[0-2],r9i5n[3-8],r9i6n[0,7-8])
- iteration 5701/ 159576 | consumed samples: 160112 | elapsed time per iteration (ms): 16355.6 | learning rate: 4.429E-05 | global batch size: 64 | lm loss: 6.383047E+00 | loss scale: 4096.0 | grad norm: 277390.380 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
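The PULSE lines above come from the run's SLURM watchdog: one job has been running for 19:15:46 while the follow-on job sits queued behind it via the dependency mechanism. On throughput, the records carry everything needed except the sequence length: 64 samples every ~16.4 s is roughly 3.9 samples/s. A back-of-the-envelope helper (a sketch; seq_len is a parameter here because it does not appear in these lines):

    # Throughput from the logged fields; seq_len is an assumption/parameter,
    # since the sequence length is not printed in these records.
    def samples_per_second(elapsed_ms, global_batch_size=64):
        return global_batch_size / (elapsed_ms / 1000.0)

    def tokens_per_second(elapsed_ms, seq_len, global_batch_size=64):
        return samples_per_second(elapsed_ms, global_batch_size) * seq_len

    print(f"{samples_per_second(16472.4):.2f} samples/s")  # iteration 5700: ~3.89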
- iteration 5702/ 159576 | consumed samples: 160176 | elapsed time per iteration (ms): 16761.7 | learning rate: 4.431E-05 | global batch size: 64 | lm loss: 6.450840E+00 | loss scale: 4096.0 | grad norm: 91541.343 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5703/ 159576 | consumed samples: 160240 | elapsed time per iteration (ms): 9560.9 | learning rate: 4.431E-05 | global batch size: 64 | lm loss: 6.493016E+00 | loss scale: 2048.0 | grad norm: 91541.343 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5704/ 159576 | consumed samples: 160304 | elapsed time per iteration (ms): 15881.2 | learning rate: 4.433E-05 | global batch size: 64 | lm loss: 6.513262E+00 | loss scale: 2048.0 | grad norm: 63292.650 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5705/ 159576 | consumed samples: 160368 | elapsed time per iteration (ms): 16396.1 | learning rate: 4.434E-05 | global batch size: 64 | lm loss: 6.341697E+00 | loss scale: 2048.0 | grad norm: 49175.756 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5706/ 159576 | consumed samples: 160432 | elapsed time per iteration (ms): 16742.1 | learning rate: 4.436E-05 | global batch size: 64 | lm loss: 6.376310E+00 | loss scale: 2048.0 | grad norm: 49500.870 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5707/ 159576 | consumed samples: 160496 | elapsed time per iteration (ms): 16502.9 | learning rate: 4.438E-05 | global batch size: 64 | lm loss: 6.305195E+00 | loss scale: 2048.0 | grad norm: 66863.451 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5708/ 159576 | consumed samples: 160560 | elapsed time per iteration (ms): 16427.2 | learning rate: 4.440E-05 | global batch size: 64 | lm loss: 6.338213E+00 | loss scale: 2048.0 | grad norm: 49886.489 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5709/ 159576 | consumed samples: 160624 | elapsed time per iteration (ms): 16430.3 | learning rate: 4.441E-05 | global batch size: 64 | lm loss: 6.403567E+00 | loss scale: 2048.0 | grad norm: 67050.774 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5710/ 159576 | consumed samples: 160688 | elapsed time per iteration (ms): 16701.6 | learning rate: 4.443E-05 | global batch size: 64 | lm loss: 6.365169E+00 | loss scale: 2048.0 | grad norm: 65553.235 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5711/ 159576 | consumed samples: 160752 | elapsed time per iteration (ms): 16495.7 | learning rate: 4.445E-05 | global batch size: 64 | lm loss: 6.437389E+00 | loss scale: 2048.0 | grad norm: 42948.956 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5712/ 159576 | consumed samples: 160816 | elapsed time per iteration (ms): 16396.0 | learning rate: 4.447E-05 | global batch size: 64 | lm loss: 6.359374E+00 | loss scale: 2048.0 | grad norm: 47459.652 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5713/ 159576 | consumed samples: 160880 | elapsed time per iteration (ms): 16399.1 | learning rate: 4.449E-05 | global batch size: 64 | lm loss: 6.384996E+00 | loss scale: 2048.0 | grad norm: 54873.542 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5714/ 159576 | consumed samples: 160944 | elapsed time per iteration (ms): 16655.8 | learning rate: 4.450E-05 | global batch size: 64 | lm loss: 6.407744E+00 | loss scale: 2048.0 | grad norm: 49484.496 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5715/ 159576 | consumed samples: 161008 | elapsed time per iteration (ms): 16395.3 | learning rate: 4.452E-05 | global batch size: 64 | lm loss: 6.596529E+00 | loss scale: 2048.0 | grad norm: 56205.082 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5716/ 159576 | consumed samples: 161072 | elapsed time per iteration (ms): 16464.0 | learning rate: 4.454E-05 | global batch size: 64 | lm loss: 6.421166E+00 | loss scale: 2048.0 | grad norm: 62635.742 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5717/ 159576 | consumed samples: 161136 | elapsed time per iteration (ms): 16725.6 | learning rate: 4.456E-05 | global batch size: 64 | lm loss: 6.470579E+00 | loss scale: 2048.0 | grad norm: 63421.257 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5718/ 159576 | consumed samples: 161200 | elapsed time per iteration (ms): 16562.5 | learning rate: 4.457E-05 | global batch size: 64 | lm loss: 6.431957E+00 | loss scale: 2048.0 | grad norm: 41629.913 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5719/ 159576 | consumed samples: 161264 | elapsed time per iteration (ms): 16447.6 | learning rate: 4.459E-05 | global batch size: 64 | lm loss: 6.372540E+00 | loss scale: 2048.0 | grad norm: 52749.135 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5720/ 159576 | consumed samples: 161328 | elapsed time per iteration (ms): 16436.0 | learning rate: 4.461E-05 | global batch size: 64 | lm loss: 6.376571E+00 | loss scale: 2048.0 | grad norm: 152378.164 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5721/ 159576 | consumed samples: 161392 | elapsed time per iteration (ms): 16522.7 | learning rate: 4.463E-05 | global batch size: 64 | lm loss: 6.346034E+00 | loss scale: 2048.0 | grad norm: 79170.187 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5722/ 159576 | consumed samples: 161456 | elapsed time per iteration (ms): 16447.7 | learning rate: 4.464E-05 | global batch size: 64 | lm loss: 6.379195E+00 | loss scale: 2048.0 | grad norm: 54035.991 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5723/ 159576 | consumed samples: 161520 | elapsed time per iteration (ms): 16383.8 | learning rate: 4.466E-05 | global batch size: 64 | lm loss: 6.410875E+00 | loss scale: 2048.0 | grad norm: 122622.327 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5724/ 159576 | consumed samples: 161584 | elapsed time per iteration (ms): 16762.9 | learning rate: 4.468E-05 | global batch size: 64 | lm loss: 6.426128E+00 | loss scale: 2048.0 | grad norm: 61346.953 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5725/ 159576 | consumed samples: 161648 | elapsed time per iteration (ms): 16455.6 | learning rate: 4.470E-05 | global batch size: 64 | lm loss: 6.440339E+00 | loss scale: 2048.0
| grad norm: 114917.377 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5726/ 159576 | consumed samples: 161712 | elapsed time per iteration (ms): 16491.5 | learning rate: 4.472E-05 | global batch size: 64 | lm loss: 6.229801E+00 | loss scale: 2048.0 | grad norm: 43861.570 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5727/ 159576 | consumed samples: 161776 | elapsed time per iteration (ms): 16434.9 | learning rate: 4.473E-05 | global batch size: 64 | lm loss: 6.503794E+00 | loss scale: 2048.0 | grad norm: 59176.822 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5728/ 159576 | consumed samples: 161840 | elapsed time per iteration (ms): 16686.0 | learning rate: 4.475E-05 | global batch size: 64 | lm loss: 6.415756E+00 | loss scale: 2048.0 | grad norm: 62124.293 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5729/ 159576 | consumed samples: 161904 | elapsed time per iteration (ms): 16403.6 | learning rate: 4.477E-05 | global batch size: 64 | lm loss: 6.457495E+00 | loss scale: 2048.0 | grad norm: 56507.999 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5730/ 159576 | consumed samples: 161968 | elapsed time per iteration (ms): 16426.6 | learning rate: 4.479E-05 | global batch size: 64 | lm loss: 6.469141E+00 | loss scale: 2048.0 | grad norm: 61746.631 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5731/ 159576 | consumed samples: 162032 | elapsed time per iteration (ms): 16455.5 | learning rate: 4.480E-05 | global batch size: 64 | lm loss: 6.459309E+00 | loss scale: 2048.0 | grad norm: 59449.114 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5732/ 159576 | consumed samples: 162096 | elapsed time per iteration (ms): 16649.1 | learning rate: 4.482E-05 | global batch size: 64 | lm loss: 6.402276E+00 | loss scale: 2048.0 | grad norm: 46335.687 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5733/ 159576 | consumed samples: 162160 | elapsed time per iteration (ms): 16461.8 | learning rate: 4.484E-05 | global batch size: 64 | lm loss: 6.519283E+00 | loss scale: 2048.0 | grad norm: 66042.113 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5734/ 159576 | consumed samples: 162224 | elapsed time per iteration (ms): 16320.8 | learning rate: 4.486E-05 | global batch size: 64 | lm loss: 6.357197E+00 | loss scale: 2048.0 | grad norm: 86317.077 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5735/ 159576 | consumed samples: 162288 | elapsed time per iteration (ms): 16817.7 | learning rate: 4.488E-05 | global batch size: 64 | lm loss: 6.412820E+00 | loss scale: 2048.0 | grad norm: 68051.158 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5736/ 159576 | consumed samples: 162352 | elapsed time per iteration (ms): 16374.0 | learning rate: 4.489E-05 | global batch size: 64 | lm loss: 6.409474E+00 | loss scale: 2048.0 | grad norm: 52474.381 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5737/ 159576 | consumed samples: 
162416 | elapsed time per iteration (ms): 16279.5 | learning rate: 4.491E-05 | global batch size: 64 | lm loss: 6.432059E+00 | loss scale: 2048.0 | grad norm: 60932.044 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5738/ 159576 | consumed samples: 162480 | elapsed time per iteration (ms): 16405.5 | learning rate: 4.493E-05 | global batch size: 64 | lm loss: 6.389083E+00 | loss scale: 2048.0 | grad norm: 97554.805 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5739/ 159576 | consumed samples: 162544 | elapsed time per iteration (ms): 16881.2 | learning rate: 4.495E-05 | global batch size: 64 | lm loss: 6.352797E+00 | loss scale: 2048.0 | grad norm: 56410.885 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5740/ 159576 | consumed samples: 162608 | elapsed time per iteration (ms): 16465.8 | learning rate: 4.496E-05 | global batch size: 64 | lm loss: 6.400247E+00 | loss scale: 2048.0 | grad norm: 67543.254 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5741/ 159576 | consumed samples: 162672 | elapsed time per iteration (ms): 16430.8 | learning rate: 4.498E-05 | global batch size: 64 | lm loss: 6.361669E+00 | loss scale: 2048.0 | grad norm: 49133.819 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5742/ 159576 | consumed samples: 162736 | elapsed time per iteration (ms): 16371.1 | learning rate: 4.500E-05 | global batch size: 64 | lm loss: 6.415005E+00 | loss scale: 2048.0 | grad norm: 84089.923 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5743/ 159576 | consumed samples: 162800 | elapsed time per iteration (ms): 16700.6 | learning rate: 4.502E-05 | global batch size: 64 | lm loss: 6.365685E+00 | loss scale: 2048.0 | grad norm: 51630.988 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5744/ 159576 | consumed samples: 162864 | elapsed time per iteration (ms): 16325.3 | learning rate: 4.504E-05 | global batch size: 64 | lm loss: 6.440388E+00 | loss scale: 2048.0 | grad norm: 72309.287 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5745/ 159576 | consumed samples: 162928 | elapsed time per iteration (ms): 16329.9 | learning rate: 4.505E-05 | global batch size: 64 | lm loss: 6.466510E+00 | loss scale: 2048.0 | grad norm: 42690.447 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5746/ 159576 | consumed samples: 162992 | elapsed time per iteration (ms): 16621.4 | learning rate: 4.507E-05 | global batch size: 64 | lm loss: 6.487222E+00 | loss scale: 2048.0 | grad norm: 71804.170 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5747/ 159576 | consumed samples: 163056 | elapsed time per iteration (ms): 16495.0 | learning rate: 4.509E-05 | global batch size: 64 | lm loss: 6.362286E+00 | loss scale: 2048.0 | grad norm: 86678.801 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5748/ 159576 | consumed samples: 163120 | elapsed time per iteration (ms): 16346.4 | learning rate: 4.511E-05 | global batch size: 64 | lm loss: 6.356483E+00 | loss scale: 2048.0 | grad norm: 
59964.749 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5749/ 159576 | consumed samples: 163184 | elapsed time per iteration (ms): 16441.6 | learning rate: 4.512E-05 | global batch size: 64 | lm loss: 6.417390E+00 | loss scale: 2048.0 | grad norm: 50380.263 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5750/ 159576 | consumed samples: 163248 | elapsed time per iteration (ms): 16658.5 | learning rate: 4.514E-05 | global batch size: 64 | lm loss: 6.274541E+00 | loss scale: 2048.0 | grad norm: 39059.607 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5751/ 159576 | consumed samples: 163312 | elapsed time per iteration (ms): 16405.5 | learning rate: 4.516E-05 | global batch size: 64 | lm loss: 6.367218E+00 | loss scale: 2048.0 | grad norm: 51183.207 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5752/ 159576 | consumed samples: 163376 | elapsed time per iteration (ms): 16320.2 | learning rate: 4.518E-05 | global batch size: 64 | lm loss: 6.344701E+00 | loss scale: 2048.0 | grad norm: 36962.457 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5753/ 159576 | consumed samples: 163440 | elapsed time per iteration (ms): 16390.0 | learning rate: 4.520E-05 | global batch size: 64 | lm loss: 6.400953E+00 | loss scale: 2048.0 | grad norm: 66022.407 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5754/ 159576 | consumed samples: 163504 | elapsed time per iteration (ms): 16546.1 | learning rate: 4.521E-05 | global batch size: 64 | lm loss: 6.378292E+00 | loss scale: 2048.0 | grad norm: 51492.219 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5755/ 159576 | consumed samples: 163568 | elapsed time per iteration (ms): 16433.9 | learning rate: 4.523E-05 | global batch size: 64 | lm loss: 6.447009E+00 | loss scale: 2048.0 | grad norm: 67150.422 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5756/ 159576 | consumed samples: 163632 | elapsed time per iteration (ms): 16359.3 | learning rate: 4.525E-05 | global batch size: 64 | lm loss: 6.393310E+00 | loss scale: 2048.0 | grad norm: 47124.929 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5757/ 159576 | consumed samples: 163696 | elapsed time per iteration (ms): 16714.1 | learning rate: 4.527E-05 | global batch size: 64 | lm loss: 6.428847E+00 | loss scale: 2048.0 | grad norm: 73984.124 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5758/ 159576 | consumed samples: 163760 | elapsed time per iteration (ms): 16285.5 | learning rate: 4.528E-05 | global batch size: 64 | lm loss: 6.410369E+00 | loss scale: 2048.0 | grad norm: 51894.933 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5759/ 159576 | consumed samples: 163824 | elapsed time per iteration (ms): 16346.5 | learning rate: 4.530E-05 | global batch size: 64 | lm loss: 6.361977E+00 | loss scale: 2048.0 | grad norm: 46022.549 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5760/ 159576 | consumed samples: 163888 | elapsed 
time per iteration (ms): 16363.4 | learning rate: 4.532E-05 | global batch size: 64 | lm loss: 6.411450E+00 | loss scale: 2048.0 | grad norm: 62804.958 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5761/ 159576 | consumed samples: 163952 | elapsed time per iteration (ms): 16576.6 | learning rate: 4.534E-05 | global batch size: 64 | lm loss: 6.492290E+00 | loss scale: 2048.0 | grad norm: 91376.279 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5762/ 159576 | consumed samples: 164016 | elapsed time per iteration (ms): 16429.0 | learning rate: 4.536E-05 | global batch size: 64 | lm loss: 6.351690E+00 | loss scale: 2048.0 | grad norm: 56460.123 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5763/ 159576 | consumed samples: 164080 | elapsed time per iteration (ms): 16419.8 | learning rate: 4.537E-05 | global batch size: 64 | lm loss: 6.388021E+00 | loss scale: 2048.0 | grad norm: 48184.276 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5764/ 159576 | consumed samples: 164144 | elapsed time per iteration (ms): 16346.0 | learning rate: 4.539E-05 | global batch size: 64 | lm loss: 6.500803E+00 | loss scale: 2048.0 | grad norm: 47702.715 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5765/ 159576 | consumed samples: 164208 | elapsed time per iteration (ms): 16601.8 | learning rate: 4.541E-05 | global batch size: 64 | lm loss: 6.377601E+00 | loss scale: 2048.0 | grad norm: 52558.168 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5766/ 159576 | consumed samples: 164272 | elapsed time per iteration (ms): 16306.8 | learning rate: 4.543E-05 | global batch size: 64 | lm loss: 6.348913E+00 | loss scale: 2048.0 | grad norm: 75335.243 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5767/ 159576 | consumed samples: 164336 | elapsed time per iteration (ms): 16391.8 | learning rate: 4.544E-05 | global batch size: 64 | lm loss: 6.287434E+00 | loss scale: 2048.0 | grad norm: 51886.097 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5768/ 159576 | consumed samples: 164400 | elapsed time per iteration (ms): 16644.5 | learning rate: 4.546E-05 | global batch size: 64 | lm loss: 6.409395E+00 | loss scale: 2048.0 | grad norm: 59368.924 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5769/ 159576 | consumed samples: 164464 | elapsed time per iteration (ms): 16355.1 | learning rate: 4.548E-05 | global batch size: 64 | lm loss: 6.376360E+00 | loss scale: 2048.0 | grad norm: 45775.427 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5770/ 159576 | consumed samples: 164528 | elapsed time per iteration (ms): 16317.3 | learning rate: 4.550E-05 | global batch size: 64 | lm loss: 6.428416E+00 | loss scale: 2048.0 | grad norm: 53234.486 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5771/ 159576 | consumed samples: 164592 | elapsed time per iteration (ms): 16327.7 | learning rate: 4.551E-05 | global batch size: 64 | lm loss: 6.374567E+00 | loss scale: 2048.0 | grad norm: 44963.056 | num zeros: 
0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5772/ 159576 | consumed samples: 164656 | elapsed time per iteration (ms): 16674.7 | learning rate: 4.553E-05 | global batch size: 64 | lm loss: 6.357097E+00 | loss scale: 2048.0 | grad norm: 47484.231 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5773/ 159576 | consumed samples: 164720 | elapsed time per iteration (ms): 16463.9 | learning rate: 4.555E-05 | global batch size: 64 | lm loss: 6.398357E+00 | loss scale: 2048.0 | grad norm: 41638.418 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5774/ 159576 | consumed samples: 164784 | elapsed time per iteration (ms): 16348.7 | learning rate: 4.557E-05 | global batch size: 64 | lm loss: 6.351582E+00 | loss scale: 2048.0 | grad norm: 54903.850 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5775/ 159576 | consumed samples: 164848 | elapsed time per iteration (ms): 16736.5 | learning rate: 4.559E-05 | global batch size: 64 | lm loss: 6.367338E+00 | loss scale: 2048.0 | grad norm: 43171.778 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5776/ 159576 | consumed samples: 164912 | elapsed time per iteration (ms): 16420.4 | learning rate: 4.560E-05 | global batch size: 64 | lm loss: 6.386267E+00 | loss scale: 2048.0 | grad norm: 68637.095 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5777/ 159576 | consumed samples: 164976 | elapsed time per iteration (ms): 16467.1 | learning rate: 4.562E-05 | global batch size: 64 | lm loss: 6.368368E+00 | loss scale: 2048.0 | grad norm: 47557.345 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5778/ 159576 | consumed samples: 165040 | elapsed time per iteration (ms): 16383.6 | learning rate: 4.564E-05 | global batch size: 64 | lm loss: 6.360928E+00 | loss scale: 2048.0 | grad norm: 48661.439 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5779/ 159576 | consumed samples: 165104 | elapsed time per iteration (ms): 16795.3 | learning rate: 4.566E-05 | global batch size: 64 | lm loss: 6.286585E+00 | loss scale: 2048.0 | grad norm: 41957.074 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5780/ 159576 | consumed samples: 165168 | elapsed time per iteration (ms): 16414.6 | learning rate: 4.567E-05 | global batch size: 64 | lm loss: 6.329445E+00 | loss scale: 2048.0 | grad norm: 58532.760 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5781/ 159576 | consumed samples: 165232 | elapsed time per iteration (ms): 16413.2 | learning rate: 4.569E-05 | global batch size: 64 | lm loss: 6.447413E+00 | loss scale: 2048.0 | grad norm: 58971.422 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5782/ 159576 | consumed samples: 165296 | elapsed time per iteration (ms): 16345.1 | learning rate: 4.571E-05 | global batch size: 64 | lm loss: 6.367276E+00 | loss scale: 2048.0 | grad norm: 62853.125 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5783/ 159576 | consumed samples: 165360 | elapsed time per iteration 
(ms): 16700.8 | learning rate: 4.573E-05 | global batch size: 64 | lm loss: 6.394166E+00 | loss scale: 2048.0 | grad norm: 104426.360 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5784/ 159576 | consumed samples: 165424 | elapsed time per iteration (ms): 16276.5 | learning rate: 4.575E-05 | global batch size: 64 | lm loss: 6.447882E+00 | loss scale: 2048.0 | grad norm: 50564.392 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5785/ 159576 | consumed samples: 165488 | elapsed time per iteration (ms): 16423.7 | learning rate: 4.576E-05 | global batch size: 64 | lm loss: 6.341421E+00 | loss scale: 2048.0 | grad norm: 126331.219 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5786/ 159576 | consumed samples: 165552 | elapsed time per iteration (ms): 16792.0 | learning rate: 4.578E-05 | global batch size: 64 | lm loss: 6.384687E+00 | loss scale: 2048.0 | grad norm: 54058.867 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5787/ 159576 | consumed samples: 165616 | elapsed time per iteration (ms): 16388.2 | learning rate: 4.580E-05 | global batch size: 64 | lm loss: 6.392807E+00 | loss scale: 2048.0 | grad norm: 59371.923 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5788/ 159576 | consumed samples: 165680 | elapsed time per iteration (ms): 16392.6 | learning rate: 4.582E-05 | global batch size: 64 | lm loss: 6.457485E+00 | loss scale: 2048.0 | grad norm: 65736.175 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5789/ 159576 | consumed samples: 165744 | elapsed time per iteration (ms): 16338.9 | learning rate: 4.583E-05 | global batch size: 64 | lm loss: 6.370594E+00 | loss scale: 2048.0 | grad norm: 86846.852 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5790/ 159576 | consumed samples: 165808 | elapsed time per iteration (ms): 16857.0 | learning rate: 4.585E-05 | global batch size: 64 | lm loss: 6.412526E+00 | loss scale: 2048.0 | grad norm: 77325.810 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5791/ 159576 | consumed samples: 165872 | elapsed time per iteration (ms): 16398.4 | learning rate: 4.587E-05 | global batch size: 64 | lm loss: 6.412295E+00 | loss scale: 2048.0 | grad norm: 50166.463 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5792/ 159576 | consumed samples: 165936 | elapsed time per iteration (ms): 16290.5 | learning rate: 4.589E-05 | global batch size: 64 | lm loss: 6.380277E+00 | loss scale: 2048.0 | grad norm: 48226.590 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5793/ 159576 | consumed samples: 166000 | elapsed time per iteration (ms): 16371.0 | learning rate: 4.591E-05 | global batch size: 64 | lm loss: 6.359699E+00 | loss scale: 2048.0 | grad norm: 65168.886 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5794/ 159576 | consumed samples: 166064 | elapsed time per iteration (ms): 16645.3 | learning rate: 4.592E-05 | global batch size: 64 | lm loss: 6.321030E+00 | loss scale: 2048.0 | grad norm: 52186.470 | num zeros: 0.0 | number of 
skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5795/ 159576 | consumed samples: 166128 | elapsed time per iteration (ms): 16469.4 | learning rate: 4.594E-05 | global batch size: 64 | lm loss: 6.393083E+00 | loss scale: 2048.0 | grad norm: 55272.030 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5796/ 159576 | consumed samples: 166192 | elapsed time per iteration (ms): 16425.9 | learning rate: 4.596E-05 | global batch size: 64 | lm loss: 6.374780E+00 | loss scale: 2048.0 | grad norm: 53939.279 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5797/ 159576 | consumed samples: 166256 | elapsed time per iteration (ms): 16770.7 | learning rate: 4.598E-05 | global batch size: 64 | lm loss: 6.376060E+00 | loss scale: 2048.0 | grad norm: 62276.052 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5798/ 159576 | consumed samples: 166320 | elapsed time per iteration (ms): 16339.0 | learning rate: 4.599E-05 | global batch size: 64 | lm loss: 6.463357E+00 | loss scale: 2048.0 | grad norm: 55276.460 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5799/ 159576 | consumed samples: 166384 | elapsed time per iteration (ms): 16400.6 | learning rate: 4.601E-05 | global batch size: 64 | lm loss: 6.364144E+00 | loss scale: 2048.0 | grad norm: 46941.317 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5800/ 159576 | consumed samples: 166448 | elapsed time per iteration (ms): 16328.3 | learning rate: 4.603E-05 | global batch size: 64 | lm loss: 6.412081E+00 | loss scale: 2048.0 | grad norm: 61281.255 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5801/ 159576 | consumed samples: 166512 | elapsed time per iteration (ms): 16791.0 | learning rate: 4.605E-05 | global batch size: 64 | lm loss: 6.396990E+00 | loss scale: 2048.0 | grad norm: 90543.167 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5802/ 159576 | consumed samples: 166576 | elapsed time per iteration (ms): 16555.9 | learning rate: 4.607E-05 | global batch size: 64 | lm loss: 6.358585E+00 | loss scale: 2048.0 | grad norm: 43097.920 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5803/ 159576 | consumed samples: 166640 | elapsed time per iteration (ms): 16465.5 | learning rate: 4.608E-05 | global batch size: 64 | lm loss: 6.493999E+00 | loss scale: 2048.0 | grad norm: 45567.331 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5804/ 159576 | consumed samples: 166704 | elapsed time per iteration (ms): 16436.4 | learning rate: 4.610E-05 | global batch size: 64 | lm loss: 6.533109E+00 | loss scale: 2048.0 | grad norm: 127288.085 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5805/ 159576 | consumed samples: 166768 | elapsed time per iteration (ms): 16549.3 | learning rate: 4.612E-05 | global batch size: 64 | lm loss: 6.379089E+00 | loss scale: 2048.0 | grad norm: 48002.691 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5806/ 159576 | consumed samples: 166832 | elapsed time per iteration (ms): 16407.1 | 
learning rate: 4.614E-05 | global batch size: 64 | lm loss: 6.365424E+00 | loss scale: 2048.0 | grad norm: 49891.608 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5807/ 159576 | consumed samples: 166896 | elapsed time per iteration (ms): 16379.2 | learning rate: 4.615E-05 | global batch size: 64 | lm loss: 6.476014E+00 | loss scale: 2048.0 | grad norm: 47532.881 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5808/ 159576 | consumed samples: 166960 | elapsed time per iteration (ms): 16753.6 | learning rate: 4.617E-05 | global batch size: 64 | lm loss: 6.354483E+00 | loss scale: 2048.0 | grad norm: 56392.704 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5809/ 159576 | consumed samples: 167024 | elapsed time per iteration (ms): 16393.4 | learning rate: 4.619E-05 | global batch size: 64 | lm loss: 6.519560E+00 | loss scale: 2048.0 | grad norm: 44344.198 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5810/ 159576 | consumed samples: 167088 | elapsed time per iteration (ms): 16492.5 | learning rate: 4.621E-05 | global batch size: 64 | lm loss: 6.408142E+00 | loss scale: 2048.0 | grad norm: 49620.831 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5811/ 159576 | consumed samples: 167152 | elapsed time per iteration (ms): 16428.1 | learning rate: 4.622E-05 | global batch size: 64 | lm loss: 6.376643E+00 | loss scale: 2048.0 | grad norm: 54930.966 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5812/ 159576 | consumed samples: 167216 | elapsed time per iteration (ms): 16603.5 | learning rate: 4.624E-05 | global batch size: 64 | lm loss: 6.446056E+00 | loss scale: 2048.0 | grad norm: 49991.934 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5813/ 159576 | consumed samples: 167280 | elapsed time per iteration (ms): 16423.7 | learning rate: 4.626E-05 | global batch size: 64 | lm loss: 6.503972E+00 | loss scale: 2048.0 | grad norm: 48324.994 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5814/ 159576 | consumed samples: 167344 | elapsed time per iteration (ms): 16392.6 | learning rate: 4.628E-05 | global batch size: 64 | lm loss: 6.483917E+00 | loss scale: 2048.0 | grad norm: 49344.656 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5815/ 159576 | consumed samples: 167408 | elapsed time per iteration (ms): 16437.6 | learning rate: 4.630E-05 | global batch size: 64 | lm loss: 6.359298E+00 | loss scale: 2048.0 | grad norm: 46826.938 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5816/ 159576 | consumed samples: 167472 | elapsed time per iteration (ms): 16791.2 | learning rate: 4.631E-05 | global batch size: 64 | lm loss: 6.477077E+00 | loss scale: 2048.0 | grad norm: 80606.642 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5817/ 159576 | consumed samples: 167536 | elapsed time per iteration (ms): 16448.9 | learning rate: 4.633E-05 | global batch size: 64 | lm loss: 6.378170E+00 | loss scale: 2048.0 | grad norm: 50159.917 | num zeros: 0.0 | number of skipped iterations: 
0 | number of nan iterations: 0 | -time (ms) - iteration 5818/ 159576 | consumed samples: 167600 | elapsed time per iteration (ms): 16473.7 | learning rate: 4.635E-05 | global batch size: 64 | lm loss: 6.336848E+00 | loss scale: 2048.0 | grad norm: 68729.538 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5819/ 159576 | consumed samples: 167664 | elapsed time per iteration (ms): 16753.1 | learning rate: 4.637E-05 | global batch size: 64 | lm loss: 6.448166E+00 | loss scale: 2048.0 | grad norm: 53348.776 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5820/ 159576 | consumed samples: 167728 | elapsed time per iteration (ms): 16453.7 | learning rate: 4.638E-05 | global batch size: 64 | lm loss: 6.433999E+00 | loss scale: 2048.0 | grad norm: 56781.530 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5821/ 159576 | consumed samples: 167792 | elapsed time per iteration (ms): 16425.7 | learning rate: 4.640E-05 | global batch size: 64 | lm loss: 6.397796E+00 | loss scale: 2048.0 | grad norm: 51600.258 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5822/ 159576 | consumed samples: 167856 | elapsed time per iteration (ms): 16451.4 | learning rate: 4.642E-05 | global batch size: 64 | lm loss: 6.353134E+00 | loss scale: 2048.0 | grad norm: 49519.612 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5823/ 159576 | consumed samples: 167920 | elapsed time per iteration (ms): 16634.5 | learning rate: 4.644E-05 | global batch size: 64 | lm loss: 6.402969E+00 | loss scale: 2048.0 | grad norm: 52985.835 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5824/ 159576 | consumed samples: 167984 | elapsed time per iteration (ms): 16465.1 | learning rate: 4.646E-05 | global batch size: 64 | lm loss: 6.411339E+00 | loss scale: 2048.0 | grad norm: 61452.738 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5825/ 159576 | consumed samples: 168048 | elapsed time per iteration (ms): 16470.5 | learning rate: 4.647E-05 | global batch size: 64 | lm loss: 6.406009E+00 | loss scale: 2048.0 | grad norm: 58959.118 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5826/ 159576 | consumed samples: 168112 | elapsed time per iteration (ms): 16992.0 | learning rate: 4.649E-05 | global batch size: 64 | lm loss: 6.401187E+00 | loss scale: 2048.0 | grad norm: 61581.299 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5827/ 159576 | consumed samples: 168176 | elapsed time per iteration (ms): 16478.9 | learning rate: 4.651E-05 | global batch size: 64 | lm loss: 6.463509E+00 | loss scale: 2048.0 | grad norm: 62591.609 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -saving checkpoint at iteration 5827 to /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints -[2021-09-25 01:42:45,269] [INFO] [logging.py:68:log_dist] [Rank 0] Saving model checkpoint: /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/global_step5827/mp_rank_00_model_states.pt - successfully saved checkpoint at iteration 5827 to /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints -time (ms) | 
-[exiting program after 1190.1195526679357 minutes] datetime: 2021-09-25 01:42:58
-ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:43:20 CEST)" to executor "default"
-Traceback (most recent call last):
-  File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs
-    executor.submit_job(job, run_times)
-  File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job
-    self._do_submit_job(job, run_times)
-  File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job
-    f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name)
-  File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit
-    raise RuntimeError('cannot schedule new futures after shutdown')
-RuntimeError: cannot schedule new futures after shutdown
-ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:44:20 CEST)" to executor "default"
-Traceback (most recent call last):
-  File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs
-    executor.submit_job(job, run_times)
-  File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job
-    self._do_submit_job(job, run_times)
-  File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job
-    f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name)
-  File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit
-    raise RuntimeError('cannot schedule new futures after shutdown')
-RuntimeError: cannot schedule new futures after shutdown
-ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:45:20 CEST)" to executor "default"
-Traceback (most recent call last):
-  File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs
-    executor.submit_job(job, run_times)
-  File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job
-    self._do_submit_job(job, run_times)
-  File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py",
line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:45:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:45:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:45:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 
01:45:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:45:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:45:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:45:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File 
"/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:45:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:46:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:46:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error 
submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:46:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:46:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:46:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:46:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - 
self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:46:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:46:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:46:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown 
-ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:46:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:46:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:46:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:47:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", 
line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:47:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:47:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:47:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule 
new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:47:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:47:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:47:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:47:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File 
"/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:47:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:47:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:47:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in 
submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:47:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:48:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:48:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:48:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - 
executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:48:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:48:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:48:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File 
"/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:48:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:48:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:48:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:48:20 CEST)" to executor "default" -Traceback (most recent call last): - File 
"/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:48:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:48:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:49:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = 
self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:49:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:49:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:49:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:49:20 CEST)" to executor 
"default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:49:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:49:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:49:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", 
line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:49:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:49:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:49:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 
01:49:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:50:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:50:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:50:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File 
"/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:50:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:50:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:50:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error 
submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:50:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:50:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:50:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:50:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - 
self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:50:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:50:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:51:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown 
-ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:51:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:51:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:51:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:51:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", 
line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:51:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:51:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:51:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule 
new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:51:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:51:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:51:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:51:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File 
"/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:52:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:52:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:52:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in 
submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:52:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:52:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:52:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:52:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - 
executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:52:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:52:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:52:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File 
"/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:52:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -ERROR:apscheduler.scheduler:Error submitting job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 01:52:20 CEST)" to executor "default" -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 979, in _process_jobs - executor.submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job - self._do_submit_job(job, run_times) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job - f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/concurrent/futures/thread.py", line 179, in submit - raise RuntimeError('cannot schedule new futures after shutdown') -RuntimeError: cannot schedule new futures after shutdown -srun: Job step aborted: Waiting up to 62 seconds for job step to finish. 
-Killing subprocess 48057
-Killing subprocess 31112
-Killing subprocess 48058
-Killing subprocess 48059
-Killing subprocess 48061
-Killing subprocess 31113
-Killing subprocess 31114
-Killing subprocess 31115
-Main process received SIGTERM, exiting
-Main process received SIGTERM, exiting
-Killing subprocess 77714
-Killing subprocess 28453
-Killing subprocess 38924
-Killing subprocess 54762
-Killing subprocess 77715
-Killing subprocess 77716
-Killing subprocess 77717
-Main process received SIGTERM, exiting
-Killing subprocess 22919
-Killing subprocess 28454
-Killing subprocess 28455
-Killing subprocess 28456
-Main process received SIGTERM, exiting
-Killing subprocess 54763
-Killing subprocess 54764
-Killing subprocess 54765
-Killing subprocess 38925
-Killing subprocess 38926
-Killing subprocess 38927
-Killing subprocess 22920
-Main process received SIGTERM, exiting
-Killing subprocess 22921
-Killing subprocess 22923
-Main process received SIGTERM, exiting
-Main process received SIGTERM, exiting
-Killing subprocess 19021
-Killing subprocess 45366
-Killing subprocess 19022
-Killing subprocess 19023
-Killing subprocess 19024
-Main process received SIGTERM, exiting
-Killing subprocess 45367
-Killing subprocess 45368
-Killing subprocess 45370
-Main process received SIGTERM, exiting
-Killing subprocess 48440
-Killing subprocess 81370
-Killing subprocess 48441
-Killing subprocess 48442
-Killing subprocess 48443
-Main process received SIGTERM, exiting
-Killing subprocess 81371
-Killing subprocess 81372
-Killing subprocess 81373
-Main process received SIGTERM, exiting
-Killing subprocess 65810
-Killing subprocess 65811
-Killing subprocess 65812
-Killing subprocess 65813
-Main process received SIGTERM, exiting
-[2021-09-25 02:08:19] PULSE: tr8-104B is waiting to be scheduled (1165978_[1-10%1] on 'gpu_p13' partition)
-*****************************************
-Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
-*****************************************
-[... the same OMP_NUM_THREADS notice repeats once for every process launched on the restarted job ...]
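Annotation: the notice above is printed by the PyTorch distributed launcher once per spawned worker. A rough sketch of what the default amounts to (an assumption about the launcher's behavior, not its actual source):

    import os

    # Assumed-behavior sketch: default OMP_NUM_THREADS to 1 per worker unless
    # the user exported a tuned value (e.g. `export OMP_NUM_THREADS=4`) first.
    os.environ.setdefault("OMP_NUM_THREADS", "1")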
---------------------------------------------------
-DeepSpeed C++/CUDA extension op report
---------------------------------------------------
-NOTE: Ops not installed will be just-in-time (JIT) compiled at
-      runtime if needed. Op compatibility means that your system
-      meets the required dependencies to JIT install the op.
---------------------------------------------------
-JIT compiled ops require ninja
-ninja .................. [OKAY]
---------------------------------------------------
-op name ................ installed .. compatible
---------------------------------------------------
-cpu_adam ............... [YES] ...... [OKAY]
-fused_adam ............. [NO] ....... [OKAY]
-fused_lamb ............. [NO] ....... [OKAY]
-sparse_attn ............ [NO] ....... [OKAY]
-transformer ............ [NO] ....... [OKAY]
-stochastic_transformer . [NO] ....... [OKAY]
---------------------------------------------------
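The compatibility column of the report above can be reproduced programmatically. A sketch, assuming a DeepSpeed version that exposes the op builders under deepspeed.ops.op_builder (the module path has moved between releases):

    import shutil
    from deepspeed.ops.op_builder import CPUAdamBuilder, FusedAdamBuilder

    # JIT compilation of [NO] ops needs ninja on PATH, as the report notes.
    print("ninja:", "[OKAY]" if shutil.which("ninja") else "[MISSING]")

    # is_compatible() is what populates the "compatible" column.
    for builder in (CPUAdamBuilder(), FusedAdamBuilder()):
        print(builder.NAME, "compatible:", builder.is_compatible())

The bundled ds_report command prints a superset of this table for the current environment.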
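"[NO]" in the installed column means the op was not prebuilt at pip-install time and is compiled on first use. A sketch of forcing that first JIT build up front, assuming a CUDA-capable machine with ninja and a CUDA toolkit matching the installed torch:

    import torch
    from deepspeed.ops.adam import FusedAdam

    # Instantiating the optimizer loads fused_adam, JIT-compiling it if no
    # prebuilt binary exists ([NO] above); the build is cached for reuse.
    params = [torch.nn.Parameter(torch.zeros(16, device="cuda"))]
    optimizer = FusedAdam(params, lr=1e-3)
    print("fused_adam loaded:", optimizer is not None)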
-ninja .................. [OKAY]
---------------------------------------------------
-op name ................ installed .. compatible
---------------------------------------------------
-cpu_adam ............... [YES] ...... [OKAY]
-fused_adam ............. [NO] ....... [OKAY]
-fused_lamb ............. [NO] ....... [OKAY]
-sparse_attn ............ [NO] ....... [OKAY]
-transformer ............ [NO] ....... [OKAY]
-stochastic_transformer . [NO] ....... [OKAY]
---------------------------------------------------
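-In the table above, installed [YES] means the op was compiled when DeepSpeed itself was installed, while [NO] ops are built by ninja the first time they are used; compatible [OKAY] means that JIT build can succeed on this system. A hedged sketch of forcing such a build up front, assuming the same op_builder API:
-
-    # load() returns the compiled extension module, running the ninja
-    # JIT build first if the op was not prebuilt ([NO] above).
-    from deepspeed.ops.op_builder import FusedAdamBuilder
-
-    fused_adam_module = FusedAdamBuilder().load()
-    print(fused_adam_module)  # the compiled fused_adam extension module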
- [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`.
-async_io ............... [NO] ....... [NO]
-transformer_inference .. [NO] ....... [OKAY]
-utils .................. [YES] ...... [OKAY]
-quantizer .............. [NO] ....... [OKAY]
---------------------------------------------------
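-async_io is the only op reported incompatible here, because the libaio development headers are missing; after `apt install libaio-dev` the op becomes JIT-compilable without reinstalling DeepSpeed. A sketch of checking just that op, assuming an AsyncIOBuilder class is exported by this DeepSpeed version:
-
-    # Returns False while libaio-dev is absent, matching the [NO] above.
-    from deepspeed.ops.op_builder import AsyncIOBuilder
-
-    print(AsyncIOBuilder().is_compatible())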
..............[OKAY] - [OKAY][OKAY] - -stochastic_transformerstochastic_transformer .. [NO][NO] .............. [OKAY][OKAY] - -ninjaninjaninjaninja ........................................................................ [OKAY][OKAY][OKAY] -[OKAY] - - --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- - - - -op nameop nameop name op name ................................................................ installedinstalledinstalled installed ........ compatiblecompatiblecompatible -compatible - --------------------------------------------------- ----------------------------------------------------------------------------------------------------- --------------------------------------------------- - - -cpu_adam ............... cpu_adamcpu_adam[YES]cpu_adam .................................... [OKAY]............... - [YES][YES][YES] .................. [OKAY][OKAY]fused_adam[OKAY] - - -............. [NO] ....... [OKAY] -fused_adamfused_lamb fused_adam fused_adam.......................... [NO] ................................. [NO] [NO][NO][OKAY] ....... -....... ....... [OKAY] [OKAY] -[OKAY] - -fused_lambfused_lambfused_lamb .......................... sparse_attn............. [NO] [NO][NO] ............ .............. .......[NO][OKAY] - [OKAY][OKAY]....... - - [OKAY] -transformer ............ [NO]sparse_attn ................... sparse_attn[NO] sparse_attn[OKAY] ....... ............ -............ [OKAY] [NO] -[NO]stochastic_transformer transformer .......................... . [NO][OKAY][OKAY][NO] ....... - -....... [OKAY]transformer[OKAY] -transformer - ........................ stochastic_transformer [NO] [NO] ............... [NO][OKAY] [OKAY] -....... - [OKAY] -stochastic_transformer stochastic_transformer . . [NO][NO] .............. [OKAY][OKAY] - --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- ---------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - --------------------------------------------------- ---------------------------------------------------DeepSpeed C++/CUDA extension op reportJIT compiled ops requires ninja - - ---------------------------------------------------DeepSpeed C++/CUDA extension op report - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- ---------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- - -DeepSpeed C++/CUDA extension op reportJIT compiled ops requires ninja - --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. 
--------------------------------------------------- -JIT compiled ops requires ninja -ninjaninjaninjaninja .................. .................................... .................. [OKAY] [OKAY][OKAY] -[OKAY] - - ------------------------------------------------------------------------------------------------------------------------------------------------------- - - ---------------------------------------------------op nameop name - op name ................ ................op name ................ installed................ installed ..installed installed .. compatible ....compatible - -compatible--------------------------------------------------compatible - --------------------------------------------------- --------------------------------------------------- --------------------------------------------------- - -cpu_adam ............... cpu_adam[YES] cpu_adam cpu_adam ..................... ............... ............... [OKAY][YES] [YES] -[YES] ...... ...... ...... [OKAY] [OKAY] -[OKAY] - -fused_adam ............. [NO] ....... [OKAY]fused_adam -fused_adam fused_adam ............. fused_lamb............. ............. .............[NO][NO][NO] .............. [NO] .......[OKAY].......[OKAY] - -[OKAY][OKAY] - -fused_lamb fused_lamb.............fused_lamb [NO].......................... .......[NO][NO] [OKAY].......sparse_attn....... - [OKAY]............[OKAY] - -[NO] ....... [OKAY] -transformer ............ sparse_attn[NO] sparse_attn sparse_attn............ ....... ........................[NO][OKAY] -[NO].......[NO] [OKAY].......stochastic_transformer....... - [OKAY][OKAY]transformer -. - ............transformer[NO] transformer ................... [NO]............ [NO] [OKAY]....... [NO] - ....... [OKAY] ....... -[OKAY] -[OKAY] -stochastic_transformer .stochastic_transformerstochastic_transformer [NO] ......... [NO][NO][OKAY] -.............. [OKAY][OKAY] - - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -ninjaninja .................................... [OKAY][OKAY] - ----------------------------------------------------------------------------------------------------- - -op nameop name ................................ installedinstalled .... compatiblecompatibleninja - ----------------------------------------------------------------------------------------------------- - -..................ninja [OKAY] -.................. [OKAY]cpu_adam-------------------------------------------------- - cpu_adam -............... -------------------------------------------------- op name............... - [YES] ................ op name[YES] ...... installed ......................[OKAY] -[OKAY]..installed - ..compatible -compatible --------------------------------------------------- ---------------------------------------------------fused_adam - ............. fused_adam[NO] .................... [OKAY][NO] - cpu_adam....... ...............cpu_adam[OKAY]fused_lamb [YES] - ............................ ...... [YES]fused_lamb[NO] [OKAY]............. - .............[OKAY][OKAY] - -[NO] ....... [OKAY] -fused_adam ............. [NO] .......fused_adam [OKAY].............sparse_attn - [NO]............sparse_attn fused_lamb ....... ............[NO] ............. 
[OKAY] .......[NO] -[NO] [OKAY]..............fused_lamb - [OKAY].............[OKAY]transformer - -[NO] ...................transformer [NO][OKAY]............ - .......[NO] [OKAY].......sparse_attn - [OKAY]............ - [NO]stochastic_transformer .......stochastic_transformer . [OKAY]sparse_attn - .[NO]............ [NO]transformer....... [NO]...................[OKAY] - .......[OKAY] [NO] -[OKAY] -....... [OKAY]transformer - ............ [NO] stochastic_transformer....... [OKAY] -. [NO]stochastic_transformer ....... [OKAY]. - [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- ---------------------------------------------------JIT compiled ops requires ninja-------------------------------------------------- - - -DeepSpeed C++/CUDA extension op reportDeepSpeed C++/CUDA extension op report - ----------------------------------------------------------------------------------------------------- - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - ----------------------------------------------------------------------------------------------------- - -JIT compiled ops requires ninjaJIT compiled ops requires ninja-------------------------------------------------- - - -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. --------------------------------------------------- -DeepSpeed C++/CUDA extension op report ------------------------------------------------------------------------------------------------------------------------------------------------------- - - -DeepSpeed C++/CUDA extension op reportDeepSpeed C++/CUDA extension op reportNOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - - --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- - - - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.JIT compiled ops requires ninjaNOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. 
-DeepSpeed C++/CUDA extension op report - --------------------------------------------------- --------------------------------------------------- --------------------------------------------------- -JIT compiled ops requires ninja -JIT compiled ops requires ninja -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system -async_io ............... [NO] ....... [NO] - meet the required dependencies to JIT install the op. - --------------------------------------------------- -JIT compiled ops requires ninja -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -ninjaninjaninjaninja .................. ...................................................... [OKAY][OKAY][OKAY][OKAY] - - - ------------------------------------------------------------------------------------------------------------------------------------------------------- --------------------------------------------------- - - -op nameop nameop nameop name ................................................................ installedinstalledinstalledinstalled ...... .. compatiblecompatiblecompatible - - -compatible---------------------------------------------------------------------------------------------------- --------------------------------------------------- - - --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report ----------------------------------------------------------------------------------------------------- - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.DeepSpeed C++/CUDA extension op report - ----------------------------------------------------------------------------------------------------- - -JIT compiled ops requires ninjaNOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - ----------------------------------------------------------------------------------------------------- - -JIT compiled ops requires ninja -DeepSpeed C++/CUDA extension op report ----------------------------------------------------------------------------------------------------- - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. -DeepSpeed C++/CUDA extension op report-------------------------------------------------- - ---------------------------------------------------JIT compiled ops requires ninja - -NOTE: Ops not installed will be just-in-time (JIT) compiled at -cpu_adamcpu_adam cpu_adam ............... cpu_adam............... ............... [YES] [YES] .....................[YES] ......[YES]...... [OKAY] - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -[OKAY][OKAY] - -...... [OKAY] -fused_adam ............. [NO]fused_adam .......fused_adam............. [OKAY].............[NO]fused_adam - [NO]fused_lamb.................... .................... [OKAY] [OKAY][NO] - - [NO]....... 
fused_lamb[OKAY].......fused_lamb - .............[OKAY]............. - [NO]fused_lamb[NO] ........................... [NO][OKAY][OKAY] - - sparse_attn....... [OKAY]............ [NO] --------------------------------------------------- -----------------------------------------------------------------------------------------------------DeepSpeed C++/CUDA extension op report - - -DeepSpeed C++/CUDA extension op report--------------------------------------------------DeepSpeed C++/CUDA extension op report - - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system -....... sparse_attnsparse_attn[OKAY] - meet the required dependencies to JIT install the op.---------------------------------------------------------------------------------------------------- - - ---------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.--------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - - - ---------------------------------------------------DeepSpeed C++/CUDA extension op reportJIT compiled ops requires ninja-------------------------------------------------- - - - -JIT compiled ops requires ninja--------------------------------------------------JIT compiled ops requires ninja - - -NOTE: Ops not installed will be just-in-time (JIT) compiled at -........................ [NO]transformer[NO] .......................... [OKAY][OKAY][NO] -sparse_attn - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja - ....... transformer[OKAY]............transformer - ............ [NO] stochastic_transformer[NO] ............ .............. . [NO] [OKAY] [OKAY][NO]....... - .......[OKAY] - -[OKAY] -stochastic_transformer transformer .stochastic_transformer ............ [NO] [NO]........ ....... [OKAY] [NO] - ....... [OKAY][OKAY] - -stochastic_transformer . [NO] ....... [OKAY] ------------------------------------------------------------------------------------------------------------------------------------------------------- - - -DeepSpeed C++/CUDA extension op reportDeepSpeed C++/CUDA extension op reportDeepSpeed C++/CUDA extension op report - - ----------------------------------------------------------------------------------------------------- --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. 
Op compatibility means that your system - meet the required dependencies to JIT install the op.---------------------------------------------------------------------------------------------------- - - -----------------------------------------------------------------------------------------------------JIT compiled ops requires ninjaJIT compiled ops requires ninja - - - -JIT compiled ops requires ninjaDeepSpeed C++/CUDA extension op report - --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -transformer_inference .. [NO] ....... [OKAY] -async_io ............... [NO] ....... [NO] -utils .................. [YES] ...... [OKAY] -transformer_inference .. [NO] ....... [OKAY] -quantizer .............. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] --------------------------------------------------- -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. --------------------------------------------------- -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -ninjaninjaninjaninja .................. .................. .................................... 
[OKAY] [OKAY] -[OKAY][OKAY] - --------------------------------------------------- --------------------------------------------------- --------------------------------------------------- ---------------------------------------------------op name - - op name................ op nameop name ................ installed................................installed .. installed ..installedcompatible -compatible -....-------------------------------------------------- -------------------------------------------------- -compatible -compatible - --------------------------------------------------- --------------------------------------------------- -cpu_adamcpu_adam .............................. cpu_adam[YES] cpu_adam[YES]............... ..................... ...... [YES][OKAY] [YES] - ......[OKAY] -......[OKAY] - [OKAY] -fused_adam ............. [NO]fused_adam .................... fused_adamfused_adam [OKAY] [NO] ............. -.................... [NO][NO][OKAY]fused_lamb - ........................... fused_lamb[NO] [OKAY][OKAY] - - .................... [NO]fused_lambfused_lamb [OKAY] .................... -............. [OKAY] [NO] -[NO] .............. [OKAY][OKAY] - -sparse_attn ............sparse_attn ............[NO]sparse_attnsparse_attn [NO]................... ....... [OKAY]............[OKAY] -[NO] - [NO]transformer transformer....... ....... ............ ............[OKAY] [OKAY][NO] - -[NO] .......transformer ....... transformer [OKAY]............ - [OKAY] ............ -[NO]stochastic_transformer [NO].......stochastic_transformer ......... [NO][OKAY] [OKAY] - [NO] -....... .......stochastic_transformer[OKAY] -stochastic_transformer[OKAY] - . [NO]. .......[NO] [OKAY]....... - [OKAY] -ninjaninjaninjaninja ........................................................................ [OKAY][OKAY] -[OKAY] -[OKAY]-------------------------------------------------- --------------------------------------------------- - - ---------------------------------------------------op name-------------------------------------------------- -op name -................ op name ................ op nameinstalled................installed ..................installed.. compatibleinstalledcompatible -.. - ..----------------------------------------------------------------------------------------------------compatible - - -compatible-------------------------------------------------- - --------------------------------------------------- -cpu_adam cpu_adam............... ...............[YES]cpu_adam cpu_adam [YES]...... .....................[OKAY]............... - [OKAY][YES][YES] - ............ [OKAY][OKAY] - -fused_adam .............fused_adam [NO]............. .......[NO] fused_adam [OKAY]fused_adam ....... - ............. ............. [OKAY]fused_lamb [NO] - [NO] ............. ....... .......fused_lamb [NO][OKAY]............. [OKAY] - ....... -[NO] [OKAY]....... -fused_lamb fused_lamb [OKAY] -.......................... [NO][NO] .............. [OKAY][OKAY] - -sparse_attn ............ [NO] sparse_attn....... ............[OKAY] -[NO] .......transformer sparse_attn[OKAY]sparse_attn ............ - ............ ............transformer [NO] [NO][NO]............ ....... .............. [OKAY] [NO] -[OKAY] [OKAY]....... - -stochastic_transformer [OKAY]transformertransformer -. ........................[NO] stochastic_transformer [NO] [NO]....... ........ ....... [OKAY][OKAY] [NO] - - [OKAY]....... - stochastic_transformer[OKAY] -stochastic_transformer. [NO]. .......[NO] [OKAY]....... 
- [OKAY] -ninjaninjaninjaninja ........................................................................ [OKAY][OKAY][OKAY][OKAY] - - - ------------------------------------------------------------------------------------------------------------------------------------------------------- --------------------------------------------------- - - -op nameop nameop name op name ................................................................ installedinstalled installed installed ........ compatiblecompatiblecompatiblecompatible - - - ----------------------------------------------------------------------------------------------------- ----------------------------------------------------------------------------------------------------- - - -cpu_adam cpu_adamcpu_adam............... ..............................cpu_adam [YES] ............... [YES][YES]......[YES] ...... ...... ......[OKAY] [OKAY][OKAY] -[OKAY] - - -fused_adamfused_adam fused_adam..........................fused_adam .............[NO] [NO] ............. [NO] ....... .............. [NO] [OKAY][OKAY] -[OKAY]....... - - [OKAY]fused_lambfused_lamb -fused_lamb ..........................fused_lamb .............[NO] ............. [NO] .......[NO] [NO]....... .......[OKAY] [OKAY] - -.......[OKAY] -[OKAY] -sparse_attn ............sparse_attn sparse_attn [NO] ............sparse_attn ............ .......[NO] ............[NO][OKAY] -.......[NO]....... transformer [OKAY]....... [OKAY] -............ -[OKAY] -[NO] transformertransformertransformer ....... ........................ ............[OKAY][NO] [NO] -[NO] ....... ....... ....... [OKAY]stochastic_transformer -[OKAY] [OKAY] - -. stochastic_transformer[NO] stochastic_transformerstochastic_transformer........ [OKAY] -.[NO]. [NO].......[NO] ....... [OKAY][OKAY]....... - - [OKAY] -ninjaninjaninjaninja ........................................................................ [OKAY][OKAY] [OKAY] -[OKAY] --------------------------------------------------- - - ----------------------------------------------------------------------------------------------------- ---------------------------------------------------op name - ----------------------------------------------------------------------------------------------------- - -DeepSpeed C++/CUDA extension op reportDeepSpeed C++/CUDA extension op report - ------------------------------------------------------------------------------------------------------------------------------------------------------- - - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - -op name op nameop name ................ ................ ................ ................installedinstalledinstalled ..installed.... compatible compatible.. - -compatible -------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- - - -JIT compiled ops requires ninja -JIT compiled ops requires ninjaNOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. 
Op compatibility means that your system - meet the required dependencies to JIT install the op. - - -DeepSpeed C++/CUDA extension op report --------------------------------------------------- ---------------------------------------------------JIT compiled ops requires ninja - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -compatible-------------------------------------------------- - - --------------------------------------------------- -cpu_adam ...............cpu_adam cpu_adamcpu_adam ...............[YES]............... .....................[YES][YES] [OKAY]......[YES] ...... - [OKAY] ...... -[OKAY] -[OKAY] -fused_adam ............. fused_adam[NO] fused_adam.............fused_adam .............[NO] ....... ............. [NO]....... [OKAY] [NO] -[OKAY]....... - .......fused_lamb[OKAY] -fused_lamb.............[OKAY] -fused_lamb[NO]............. fused_lamb.......[NO] ............. .............[OKAY] -.......[NO][NO] [OKAY].............. - [OKAY][OKAY] - -sparse_attn ............ [NO] ....... sparse_attn[OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -............sparse_attnsparse_attn transformer ............[NO] [NO]............ ............ ....... [NO]....... [NO] [OKAY][OKAY] -....... -async_io [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. ............... -ninjaninjaninjaninja ........................................................................ [OKAY] [OKAY][OKAY] -[OKAY] - --------------------------------------------------- - -....... [OKAY]stochastic_transformertransformer[OKAY] - - [NO] ....... [NO] --------------------------------------------------- -----------------------------------------------------------------------------------------------------op name -op name - op name................................op name ................installed................installed installed..installed.. .... compatiblecompatible -compatible -compatible-------------------------------------------------- - ............ .transformertransformer[NO] ............[NO]................... .......[NO][NO][OKAY] - [OKAY].............. - [OKAY]stochastic_transformer[OKAY] -transformer_inferenceasync_io ................. [NO][NO] .............. [OKAY][NO] - --------------------------------------------------- - - ----------------------------------------------------------------------------------------------------- - - -utils .................. [YES] ...... [OKAY]transformer_inference -cpu_adamcpu_adam cpu_adamcpu_adam............... ............... ............... [YES]............... [YES] [YES]...... [YES] [OKAY] ............ -...... [OKAY][OKAY] -[OKAY] -.stochastic_transformer stochastic_transformer[NO]. ....... .[NO][OKAY] -[NO]....... .......[OKAY] -[OKAY] - .. [NO]quantizer ..................... [OKAY][NO] - ....... [OKAY] - -fused_adam ............. [NO] fused_adam....... .............fused_adam[OKAY]fused_adam -utils --------------------------------------------------.................. -[NO] ............. ............. ....... fused_lamb[NO] [NO]............. [OKAY]....... - ....... [NO][OKAY] - fused_lamb.......[OKAY] - [YES] ...... [OKAY] -quantizer .............. [NO] ....... 
[OKAY] -fused_lamb[OKAY]............. --------------------------------------------------- -fused_lamb.............[NO] .............[NO]....... [NO].......[OKAY] -[OKAY]....... -sparse_attn [OKAY]............ - [NO] ....... [OKAY] -sparse_attnsparse_attn transformer ............ ............ ............sparse_attn [NO][NO][NO] ............ ..................... [NO][OKAY][OKAY] -[OKAY] -....... -transformer [OKAY]............transformer - [NO]stochastic_transformer............transformer [NO]........ ............ .......[NO][OKAY][NO] -....... [OKAY] ....... -[OKAY]stochastic_transformer - [OKAY] -stochastic_transformer. stochastic_transformer[NO]. ........[NO] [OKAY][NO]....... - .......[OKAY] -[OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] async_io....... ...............[NO]async_io - [NO] ...................... [NO][NO] -....... [NO] -transformer_inference .. [NO] ....... [OKAY] -transformer_inference .. transformer_inference[NO] utils......... [NO][OKAY].................. - .......[YES] [OKAY]...... - [OKAY] -utils .................. utilsquantizer[YES] ...................................... [YES][NO][OKAY] -............. [OKAY][OKAY] - -quantizer .............. quantizer[NO] -------------------------------------------------- .............. -....... [NO][OKAY] - ....... [OKAY] --------------------------------------------------- --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... 
[OKAY] --------------------------------------------------- ----------------------------------------------------------------------------------------------------- -DeepSpeed C++/CUDA extension op report - ---------------------------------------------------DeepSpeed C++/CUDA extension op report-------------------------------------------------- --------------------------------------------------- - - ---------------------------------------------------DeepSpeed C++/CUDA extension op reportDeepSpeed C++/CUDA extension op reportNOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - - - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.------------------------------------------------------------------------------------------------------------------------------------------------------ - - - -JIT compiled ops requires ninjaNOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.--------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - - - -JIT compiled ops requires ninja-------------------------------------------------- --------------------------------------------------- - -JIT compiled ops requires ninjaJIT compiled ops requires ninja - - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -ninjaninjaninjaninja ........................................................................ [OKAY][OKAY][OKAY][OKAY] - - - --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- - - - -op nameop nameop name op name ................................ ................ ................ installedinstalledinstalled ......installed compatiblecompatible -compatible -.. ------------------------------------------------------------------------------------------------------------------------------------------------------- - - -compatible --------------------------------------------------- -cpu_adamcpu_adam cpu_adam..............................cpu_adam [YES]..............................[YES] ...... [YES]...... [YES] [OKAY]......[OKAY]...... - - [OKAY][OKAY] - -fused_adam .............fused_adam [NO]fused_adamfused_adam............. .......[NO] ............. ............. [OKAY][NO].......[NO] - [OKAY].............. - fused_lamb[OKAY] -fused_lamb[OKAY]............. 
-.............fused_lamb[NO] [NO] .............fused_lamb ....... .......[NO]............. [OKAY][NO][OKAY]....... - - .......[OKAY] -[OKAY] -sparse_attnsparse_attn sparse_attn........................sparse_attn [NO][NO]........................ ....... ....... [NO][NO] [OKAY][OKAY] ....... -....... - transformer[OKAY][OKAY] transformer - -........................transformertransformer [NO][NO] ...................................... [NO][OKAY][OKAY][NO] - - .............. [OKAY]stochastic_transformer[OKAY] -stochastic_transformer -.stochastic_transformer. [NO]stochastic_transformer[NO] . ....... ....... .[NO] [OKAY][OKAY] - -[NO]....... .......[OKAY] -[OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- --------------------------------------------------- ---------------------------------------------------DeepSpeed C++/CUDA extension op report - --------------------------------------------------- -DeepSpeed C++/CUDA extension op reportNOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - ------------------------------------------------------------------------------------------------------------------------------------------------------- - - ---------------------------------------------------JIT compiled ops requires ninjaDeepSpeed C++/CUDA extension op reportNOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - - - ---------------------------------------------------DeepSpeed C++/CUDA extension op report-------------------------------------------------- - - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.--------------------------------------------------JIT compiled ops requires ninja - - ---------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - -JIT compiled ops requires ninja --------------------------------------------------- -JIT compiled ops requires ninja -ninjaninjaninjaninja ........................................................................ [OKAY][OKAY][OKAY][OKAY] - - - ------------------------------------------------------------------------------------------------------------------------------------------------------- --------------------------------------------------- - - -op nameop nameop name op name ................................ ................installedinstalled ................ .. ..installed installed compatible compatible.. -.. - --------------------------------------------------compatiblecompatible-------------------------------------------------- - - - ----------------------------------------------------------------------------------------------------- - -cpu_adam ...............cpu_adam [YES] cpu_adamcpu_adam ............... ...... ............... ...............[YES] [OKAY] [YES][YES] -...... 
......[OKAY]...... - [OKAY]fused_adam -[OKAY] ............. - [NO] ....... [OKAY] -fused_adam fused_adam.............fused_lamb ..........................fused_adam[NO] [NO].............[NO]....... .......[NO]....... [OKAY] [OKAY] -[OKAY] - -.......fused_lamb [OKAY] -fused_lamb............. .............[NO]fused_lamb [NO]sparse_attn.................... ...................[NO] [OKAY] [NO] -[OKAY]....... - .......[OKAY] [OKAY] - -transformer sparse_attn............ sparse_attn............ [NO]............ sparse_attn.......[NO][NO] [OKAY].............. -............ [OKAY][OKAY][NO]stochastic_transformer - - ........ transformertransformer[OKAY] -............[NO]............ transformer .......[NO] [NO] ............ [OKAY].............. - [NO][OKAY][OKAY] - -....... [OKAY] -stochastic_transformerstochastic_transformer ..stochastic_transformer [NO][NO] ............... [NO][OKAY] -[OKAY]....... - [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - --------------------------------------------------- -async_io ...............async_io [NO] ...................... [NO][NO] -....... [NO] -transformer_inference transformer_inference.. ..[NO] .......[NO] [OKAY]....... - [OKAY] -utilsutils .................................... [YES][YES] ............ [OKAY][OKAY] - -quantizerquantizer ............................ [NO][NO] .............. [OKAY][OKAY] - ----------------------------------------------------------------------------------------------------- - -ninjaninjaninja ninja.................................... ..................[OKAY][OKAY].................. - - [OKAY]-------------------------------------------------- - ---------------------------------------------------[OKAY]-------------------------------------------------- -op name - -op name................op name-------------------------------------------------- ................installed -................ op name..installedinstalled .. ................compatible .. -compatible ---------------------------------------------------installedcompatible --------------------------------------------------- - -..-------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -compatible -cpu_adam--------------------------------------------------cpu_adam -async_io ............... [NO] ....... [NO] - .............................. cpu_adam [YES] [YES] ..................... ...... [OKAY] cpu_adam[YES] -[OKAY] -..................... [OKAY][YES] - ...... [OKAY] -transformer_inference .. [NO] ....... [OKAY] -fused_adam .............fused_adam [NO]............. 
.......[NO]fused_adam .......[OKAY] -.............fused_adam[OKAY] -utils .................. [YES] ...... [OKAY] - fused_lamb[NO]fused_lamb............. .......................... ....... [NO][NO][NO] [OKAY]..................... -[OKAY] [OKAY] -[OKAY] -fused_lamb - ............. fused_lamb[NO] .................... [NO][OKAY] sparse_attn ------------------------------------------------------------------------------------------------------------------------------------------------------- - -DeepSpeed C++/CUDA extension op report -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -quantizer .............. [NO] ....... [OKAY] -....... sparse_attn ............ ............[OKAY][NO] - [NO]....... .......[OKAY] -[OKAY]sparse_attn -DeepSpeed C++/CUDA extension op report ----------------------------------------------------------------------------------------------------- -DeepSpeed C++/CUDA extension op report - - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.--------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - ----------------------------------------------------------------------------------------------------- - - ---------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system --------------------------------------------------- - ............transformer transformer ............[NO]............ sparse_attn [NO] .......[NO] ............ ....... .......[OKAY][OKAY][NO] - -[OKAY] - meet the required dependencies to JIT install the op.JIT compiled ops requires ninjaNOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - - -JIT compiled ops requires ninja - ----------------------------------------------------------------------------------------------------- - -JIT compiled ops requires ninjaJIT compiled ops requires ninja - -.......transformer [OKAY]stochastic_transformer............stochastic_transformer - [NO].. [NO]transformer .......[NO] ................... ....... [OKAY][OKAY][OKAY][NO] - - -....... stochastic_transformer[OKAY] -. [NO]stochastic_transformer ....... .[OKAY] -[NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- ----------------------------------------------------------------------------------------------------- - ---------------------------------------------------DeepSpeed C++/CUDA extension op reportDeepSpeed C++/CUDA extension op report-------------------------------------------------- - - - -----------------------------------------------------------------------------------------------------DeepSpeed C++/CUDA extension op report - -DeepSpeed C++/CUDA extension op report -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. 
Op compatibility means that your system - meet the required dependencies to JIT install the op.NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - --------------------------------------------------- ------------------------------------------------------------------------------------------------------------------------------------------------------- - - - -JIT compiled ops requires ninjaNOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.JIT compiled ops requires ninja - - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.-------------------------------------------------- - ---------------------------------------------------JIT compiled ops requires ninja - -JIT compiled ops requires ninja - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- --------------------------------------------------- ---------------------------------------------------DeepSpeed C++/CUDA extension op report - ----------------------------------------------------------------------------------------------------- -DeepSpeed C++/CUDA extension op report--------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - - - -DeepSpeed C++/CUDA extension op reportDeepSpeed C++/CUDA extension op report---------------------------------------------------------------------------------------------------- - - - -----------------------------------------------------------------------------------------------------JIT compiled ops requires ninja -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- - -----------------------------------------------------------------------------------------------------JIT compiled ops requires ninja - - -JIT compiled ops requires ninjaJIT compiled ops requires ninja - - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... 
[NO] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -async_io transformer_inference............... ..[NO] [NO]....... .......[NO] -JIT compiled ops requires ninja-------------------------------------------------- - ---------------------------------------------------DeepSpeed C++/CUDA extension op report - ---------------------------------------------------DeepSpeed C++/CUDA extension op report - ---------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.-------------------------------------------------- - - ---------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. -DeepSpeed C++/CUDA extension op report -JIT compiled ops requires ninja --------------------------------------------------- --------------------------------------------------- - -JIT compiled ops requires ninja -[OKAY] -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system -utils .................. [YES] ...... [OKAY] - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -transformer_inference quantizer.. ..............[NO] [NO]....... .......[OKAY] -[OKAY] -utils-------------------------------------------------- .................. - [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report ----------------------------------------------------------------------------------------------------- --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.-------------------------------------------------- - -DeepSpeed C++/CUDA extension op report -DeepSpeed C++/CUDA extension op report ---------------------------------------------------DeepSpeed C++/CUDA extension op report - ----------------------------------------------------------------------------------------------------- -JIT compiled ops requires ninja - - ---------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. 
Op compatibility means that your system - meet the required dependencies to JIT install the op.---------------------------------------------------------------------------------------------------- - - -JIT compiled ops requires ninja--------------------------------------------------JIT compiled ops requires ninja - - -JIT compiled ops requires ninja -ninjaninjaninjaninja ........................................................................ [OKAY] [OKAY][OKAY] -[OKAY] - - ------------------------------------------------------------------------------------------------------------------------------------------------------- - --------------------------------------------------- -op name -op nameop name ................................op name................ installed................installedinstalled .. ..installedcompatible.. -compatible..compatible --------------------------------------------------- - -------------------------------------------------- --------------------------------------------------- -compatible - --------------------------------------------------- -cpu_adam cpu_adam...............cpu_adam ............... ............... cpu_adam[YES][YES] ........................... [YES][OKAY][OKAY] - -......[YES] [OKAY] -...... [OKAY] -fused_adam ............. fused_adam[NO] fused_adam ............. ....... ............. [NO] [OKAY] [NO]....... - .......[OKAY] fused_lamb -[OKAY] -.............fused_adamfused_lamb fused_lamb[NO] .......................... ....... ............. [NO][NO] [OKAY] - .......[NO]....... [OKAY][OKAY] - -....... [OKAY] -fused_lambsparse_attn ......................... [NO][NO] .......sparse_attnsparse_attn....... ........................ [NO][NO] [OKAY].......[OKAY]....... - - [OKAY][OKAY] -transformer - ............transformer [NO]transformer............ ....... ............[OKAY][NO]sparse_attn - [NO]................... .......[OKAY]stochastic_transformer - [OKAY] .[NO] -stochastic_transformer [NO]....... stochastic_transformer [OKAY]......... - [OKAY] [NO]transformer - [NO]................... .......[OKAY] - [OKAY] -[NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report-------------------------------------------------- ------------------------------------------------------------------------------------------------------------------------------------------------------- - - -DeepSpeed C++/CUDA extension op report -DeepSpeed C++/CUDA extension op report -DeepSpeed C++/CUDA extension op report -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.-------------------------------------------------- --------------------------------------------------- - - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.----------------------------------------------------------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - - - ---------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. 
--------------------------------------------------- -JIT compiled ops requires ninjaJIT compiled ops requires ninja --------------------------------------------------- - - -JIT compiled ops requires ninjaJIT compiled ops requires ninja - -ninjaninjaninjaninja .................................... .................. ..................[OKAY][OKAY] - -[OKAY][OKAY]---------------------------------------------------------------------------------------------------- - - - -----------------------------------------------------------------------------------------------------op name -op name -op name ................ ................ op name................ installed installed ................ installed .. ..compatibleinstalled.. - -------------------------------------------------- -compatible.. -compatible -compatible-------------------------------------------------- --------------------------------------------------- --------------------------------------------------- - -cpu_adam ............... [YES] ...... cpu_adam[OKAY] cpu_adam -cpu_adam ............... ...............[YES]............... [YES]......[YES]fused_adam [OKAY] ............ -............. [OKAY][OKAY][NO] - -....... [OKAY] -fused_adam ............. fused_lamb[NO] .............fused_adam....... [NO]fused_adam.............[OKAY] ....... - [NO][OKAY] -.............fused_lamb....... [NO].............[OKAY] -.......[NO] sparse_attn[OKAY] fused_lamb....... -............ fused_lamb[NO].............[OKAY] - ....................[NO] [OKAY] [NO] -....... transformer....... [OKAY] ............ -[OKAY] sparse_attn[NO] -................... [OKAY][NO] - ....... [OKAY] -stochastic_transformer transformersparse_attn. ............ [NO]sparse_attn ............ .......[NO]............ ....... [NO][OKAY][OKAY][NO] - - .............. [OKAY][OKAY] - -stochastic_transformer transformertransformer. ........................[NO] [NO].......[NO] ..............[OKAY] -[OKAY][OKAY] - -stochastic_transformerstochastic_transformer .. [NO] [NO]....... .......[OKAY] -[OKAY] -ninjaninjaninjaninja ........................................................................ [OKAY][OKAY] - [OKAY] -[OKAY]-------------------------------------------------- --------------------------------------------------- - --------------------------------------------------- ---------------------------------------------------op name -op name -op name ................................op name ................installed................installed installed..installed.. compatible.... -compatible ---------------------------------------------------compatiblecompatible - - ------------------------------------------------------------------------------------------------------------------------------------------------------- - - -cpu_adam ............... [YES] cpu_adam......cpu_adam cpu_adam[OKAY]............... ............... - ...............[YES][YES] [YES] ...... ...... ...... [OKAY] -[OKAY]fused_adam[OKAY] - -............. [NO] ....... [OKAY] -fused_lambfused_adamfused_adam .............fused_adam............. ............. [NO] .............[NO] [NO] .............. .......[NO][OKAY][OKAY] -....... -[OKAY] -[OKAY]fused_lamb - fused_lamb............. fused_lamb............. [NO] .............sparse_attn [NO]....... .......[NO][OKAY]............ - [OKAY].......[NO] - [OKAY]....... - [OKAY] -transformersparse_attn ........................ [NO][NO]sparse_attn .......sparse_attn............ ....... 
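The async_io row reads [NO] / [NO] because the op is neither pre-built nor buildable without the libaio headers. A minimal sketch of performing the same check programmatically, assuming the AsyncIOBuilder class and its is_compatible() method from the deepspeed.ops.op_builder package of this DeepSpeed 0.4.x fork:

# Hedged sketch: mirrors the "async_io ... [NO] ....... [NO]" line above.
# AsyncIOBuilder / is_compatible() are assumed from the op_builder API
# of this DeepSpeed version; they are not shown in the log itself.
from deepspeed.ops.op_builder import AsyncIOBuilder

if not AsyncIOBuilder().is_compatible():
    # The fix suggested by the warning above (Debian/Ubuntu):
    #   apt install libaio-dev
    print("async_io incompatible: libaio development headers are missing")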
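Both the op report above and the environment block below are printed once per rank at startup; a single copy can be regenerated offline. A minimal sketch, assuming the ds_report console script that the DeepSpeed wheel installs alongside the deepspeed launcher:

# Re-emit the startup diagnostics outside a training run.
# Assumes the `ds_report` entry point shipped with this DeepSpeed wheel.
import subprocess

# Prints the same "DeepSpeed C++/CUDA extension op report" and
# "DeepSpeed general environment info" blocks seen in this log.
subprocess.run(["ds_report"], check=True)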
DeepSpeed general environment info:
torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']
torch version .................... 1.8.1
torch cuda version ............... 11.1
nvcc version ..................... 11.2
deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']
deepspeed info ................... 0.4.2+bc17042, bc17042, big-science
deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1
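The same fields come from public torch/deepspeed attributes, so they can be checked without DeepSpeed's reporting helpers. A minimal sketch (the dotted alignment is cosmetic; only the attribute names are load-bearing):

# Reproduce the environment block from public attributes.
import subprocess
import torch
import deepspeed

print("torch install path ...............", list(torch.__path__))
print("torch version ....................", torch.__version__)    # 1.8.1 in this log
print("torch cuda version ...............", torch.version.cuda)   # 11.1 in this log
print("deepspeed info ...................", deepspeed.__version__)
# nvcc comes from the CUDA toolkit on PATH (11.2 in this log).
subprocess.run(["nvcc", "--version"], check=False)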
[OKAY][OKAY][OKAY] - - ----------------------------------------------------------------------------------------------------- - -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -ninjaninjaninjaninja ........................................................................ [OKAY][OKAY][OKAY][OKAY] - - - ------------------------------------------------------------------------------------------------------------------------------------------------------- --------------------------------------------------- - -op name -transformer_inference .. [NO] ....... [OKAY] -op name op nameop name ................................ ................installed ................ installed installed.. installed .. compatible.. - ..--------------------------------------------------compatiblecompatible - - -compatible-------------------------------------------------- - -utils .................. [YES] ...... [OKAY] ----------------------------------------------------------------------------------------------------- - -cpu_adam ............... [YES] cpu_adam...... ...............[OKAY] -quantizer .............. [NO] ....... [OKAY] -cpu_adam[YES]cpu_adam .................................... [OKAY][YES][YES] --------------------------------------------------- - fused_adam............ .............[OKAY][OKAY] -[NO] -fused_adam .................... [OKAY][NO] - ....... fused_lamb[OKAY] -fused_adamfused_adam............. fused_lamb .......................... [NO] .............[NO]....... [NO] [NO].......[OKAY] -.............. [OKAY] [OKAY][OKAY] - - -fused_lambfused_lamb .......................... [NO][NO] sparse_attn....... sparse_attn...................[OKAY] -............ [NO][OKAY][NO] - .............. [OKAY][OKAY] - -sparse_attntransformertransformer ........................ ............sparse_attn[NO][NO] [NO]................... .......[OKAY] - .......[NO] [OKAY] [OKAY]stochastic_transformer -....... - transformer [OKAY]............. - stochastic_transformer [NO][NO] transformer ............... [OKAY] [OKAY]............ -[NO] - [NO]....... .......[OKAY]stochastic_transformer - [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -. [NO]stochastic_transformer ....... [OKAY]. - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] - [NO] ....... [OKAY] -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -transformer_inference .. utils[NO] ......................... [YES][OKAY] -...... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`.utils - quantizer.................. ..............[YES] [NO]...... .......[OKAY] -[OKAY] -quantizer --------------------------------------------------.............. - [NO] ....... [OKAY] -async_io-------------------------------------------------- -............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... 
[OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io async_io............... ...............[NO] [NO]....... .......[NO] -[NO] -transformer_inferencetransformer_inference .... [NO][NO] .............. [OKAY][OKAY] - -utilsutils .................................... [YES][YES] ............ [OKAY][OKAY] - -quantizer ..............quantizer [NO].............. .......[NO] [OKAY]....... - [OKAY] --------------------------------------------------- --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_ioasync_io .............................. [NO][NO] .............. [NO][NO] - -transformer_inference .. [NO] ....... [OKAY]transformer_inference - .. [NO] ....... [OKAY]utils - .................. [YES] ...... utils[OKAY] -.................. [YES] quantizer...... ..............[OKAY] -[NO] ....... [OKAY]quantizer - .............. [NO] --------------------------------------------------....... - [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference  [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`... [NO] -....... [OKAY] -utils .................. [YES] ......async_io [OKAY]............... - [NO] ....... [NO]quantizer - .............. [NO] ....... [OKAY] --------------------------------------------------- -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... 
[OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_io async_io............... ...............[NO] [NO]....... .......[NO] -[NO] -transformer_inference transformer_inference.. ..[NO] [NO]....... .......[OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -[OKAY] -async_io ............... [NO]async_io ....... ...............[NO] -[NO] ....... [NO] -utils utils.................. .................. [YES][YES] ............ [OKAY][OKAY] - -transformer_inference .. [NO] transformer_inference....... .. [OKAY][NO] -quantizerquantizer ............................ [NO][NO] .............. [OKAY][OKAY] - ----------------------------------------------------------------------------------------------------- - - ....... [OKAY] -utils .................. [YES] ......utils [OKAY].................. - [YES] ...... [OKAY]quantizer - .............. [NO]quantizer ..................... [OKAY][NO] - ....... [OKAY]-------------------------------------------------- - --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -DeepSpeed general environment info: -async_io ............... [NO] ....... [NO] -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -transformer_inference .. [NO] ....... [OKAY] -torch cuda version ............... 11.1 -utils .................. [YES] ...... [OKAY] -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -quantizer .............. [NO] ....... [OKAY] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... 
torch 1.8, cuda 11.1 --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] .......async_io [NO]............... - [NO] ....... [NO] -transformer_inference .. [NO] .......transformer_inference [OKAY].. - [NO] ....... [OKAY] -utils .................. [YES] ......utils [OKAY].................. - [YES] quantizer...... ..............[OKAY] -[NO] ....... quantizer[OKAY] -.............. [NO] ....... --------------------------------------------------[OKAY] - --------------------------------------------------- -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -transformer_inference utils.. ..................[NO] [YES]....... ......[OKAY] -[OKAY] -quantizerutils ................................ [YES][NO] ............. [OKAY][OKAY] - -quantizer-------------------------------------------------- -.............. [NO] ....... 
[OKAY] --------------------------------------------------- -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -DeepSpeed general environment info: -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] --------------------------------------------------- -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_io ............... async_io[NO] ...................... [NO][NO] -....... [NO] -transformer_inference ..transformer_inference [NO].. .......[NO] [OKAY]....... - [OKAY] -utils utils.................. ..................[YES] [YES]...... ......[OKAY] -[OKAY] -quantizerquantizer ............................ [NO][NO] .............. [OKAY][OKAY] - ----------------------------------------------------------------------------------------------------- - - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_io ............... [NO] ....... [NO] -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utilstransformer_inference .................... [YES] ......[NO] [OKAY] -....... quantizer[OKAY] -.............. [NO] ....... [OKAY] -utils-------------------------------------------------- -.................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -DeepSpeed general environment info: -torch version .................... 1.8.1 -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch cuda version ............... 11.1 -torch version .................... 1.8.1 -nvcc version ..................... 11.2 -torch cuda version ............... 11.1 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 
0.4.2+bc17042, bc17042, big-science -nvcc version ..................... 11.2 -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_io ...............async_io [NO] ...................... [NO][NO] -....... [NO] -transformer_inferencetransformer_inference .... [NO][NO] .............. [OKAY][OKAY] - -utils .................. [YES] utils...... ..................[OKAY] -[YES]quantizer .................... [OKAY][NO] - ....... [OKAY] -quantizer .............. [NO] -------------------------------------------------- -....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -transformer_inference .. [NO] ....... [OKAY] -utils ..................async_io [YES]............... ......[NO] [OKAY]....... - [NO] -quantizer .............. [NO] ....... [OKAY] -transformer_inference-------------------------------------------------- -.. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -async_io ............... [NO] ....... 
[NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY]transformer_inference - .. [NO] .......utils [OKAY].................. - [YES] ...... [OKAY] -utils ..................quantizer [YES].............. ......[NO] [OKAY]....... - [OKAY] -quantizer ..............-------------------------------------------------- -[NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -DeepSpeed general environment info:nvcc version .....................DeepSpeed general environment info: -11.2 - -deepspeed install path ...........torch install path ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']torch install path............... - deepspeed info............... ................... 0.4.2+bc17042, bc17042, big-science -['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']deepspeed wheel compiled w. -['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']...... -torch versiontorch 1.8, cuda 11.1 -torch version.................... ....................1.8.1 -1.8.1 -torch cuda version torch cuda version............... ...............11.1 -11.1 -DeepSpeed general environment info:DeepSpeed general environment info: - -nvcc version nvcc version..................... ..................... 11.211.2 - -deepspeed install pathdeepspeed install path ...................... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed infodeepspeed info ...................................... 0.4.2+bc17042, bc17042, big-science0.4.2+bc17042, bc17042, big-science - -torch install pathtorch install path .............................. ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch versiontorch version ........................................ 1.8.11.8.1 - -deepspeed wheel compiled w.deepspeed wheel compiled w. ............ torch 1.8, cuda 11.1torch 1.8, cuda 11.1 - -torch cuda versiontorch cuda version .............................. 11.111.1 - -nvcc versionnvcc version .......................................... DeepSpeed general environment info:11.211.2 - -deepspeed install path -deepspeed install path ...................... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']torch install path['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed info...............deepspeed info ...................................... 0.4.2+bc17042, bc17042, big-science0.4.2+bc17042, bc17042, big-science - -deepspeed wheel compiled w.deepspeed wheel compiled w. 
['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] ...... -...... torch 1.8, cuda 11.1torch 1.8, cuda 11.1torch version - - .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_io ............... [NO]async_io ....... ...............[NO] -[NO] ....... [NO] -transformer_inference ..transformer_inference [NO].. .......[NO] [OKAY]....... - [OKAY] -utils ..................utils [YES].................. ......[YES] [OKAY]...... - [OKAY] -quantizer .............. quantizer[NO] ..................... [NO][OKAY] -....... [OKAY] --------------------------------------------------- --------------------------------------------------- -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install pathtorch install path ............... ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch versiontorch version ........................................ 1.8.11.8.1 - -torch cuda versiontorch cuda version .............................. 11.111.1 - -nvcc versionnvcc version .......................................... 11.211.2 - -deepspeed install pathdeepspeed install path ...................... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed infodeepspeed info ...................................... 0.4.2+bc17042, bc17042, big-science0.4.2+bc17042, bc17042, big-science - -deepspeed wheel compiled w.deepspeed wheel compiled w. ............ torch 1.8, cuda 11.1torch 1.8, cuda 11.1 - -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. 
Can be fixed by: `apt install libaio-dev`. -async_ioasync_io .............................. [NO][NO] .............. [NO][NO] - -transformer_inferencetransformer_inference .... [NO][NO] .............. [OKAY][OKAY] - -utils utils.................. ..................[YES] [YES] ............ [OKAY][OKAY] - -quantizer quantizer.............. ..............[NO] [NO]....... .......[OKAY] - [OKAY] --------------------------------------------------- --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_ioasync_io .............................. [NO][NO] .............. [NO][NO] - -transformer_inference ..transformer_inference [NO].. .......[NO] [OKAY]....... - [OKAY] -utils ..................utils [YES].................. ......[YES] [OKAY]...... - [OKAY] -quantizer ..............quantizer [NO].............. .......[NO] [OKAY]....... - [OKAY] --------------------------------------------------- --------------------------------------------------- -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... 
torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_io async_io............... ...............[NO] [NO]....... .......[NO] -[NO] -transformer_inference transformer_inference.. ..[NO] [NO]....... .......[OKAY] -[OKAY] -utils utils.................. ..................[YES] [YES]...... ......[OKAY] -[OKAY] -quantizerquantizer ............................ [NO][NO] .............. [OKAY][OKAY] - --------------------------------------------------- --------------------------------------------------- -DeepSpeed general environment info: - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -async_io ............... [NO] async_io....... [NO]............... -torch version .................... 1.8.1 -torch cuda version ............... 11.1 - [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -transformer_inference .. [NO]utils ......................... [OKAY][YES] -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - ...... [OKAY] -utilsquantizer ................................ [YES][NO] ............. [OKAY][OKAY] - -quantizer-------------------------------------------------- -.............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... async_io[NO] - ............... [NO] ....... [NO] -transformer_inference .. transformer_inference[NO] ......... [NO][OKAY] -....... [OKAY] -utils .................. utils[YES] ........................ [YES][OKAY] - ...... [OKAY] -quantizer .............. 
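The report above is emitted once per launcher process, which is why repeated copies of it appear throughout this log. A minimal sketch, assuming the `ds_report` console script shipped with the DeepSpeed wheel recorded above is on PATH, for regenerating the same diagnostics from a single process:

# Sketch only, not part of the original log. `ds_report` prints the same
# C++/CUDA extension op table and "DeepSpeed general environment info"
# block captured above; running it once avoids the per-rank interleaving.
import subprocess
import torch

print(torch.__version__)   # expected: 1.8.1 in this environment
print(torch.version.cuda)  # expected: 11.1
subprocess.run(["ds_report"], check=True)

# The async_io [NO] rows trace back to the libaio warning; per the log's
# own hint, installing the system package (`apt install libaio-dev`)
# before building DeepSpeed lets that op compile.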
-/bin/sh: line 0: type: git: not found
-**** Git info for Megatron: git_hash=unknown git_branch=unknown ****
-/bin/sh: line 0: type: git: not found
-**** Git info for Megatron: git_hash=unknown git_branch=unknown ****
---------------------------------------------------
-DeepSpeed C++/CUDA extension op report
---------------------------------------------------
-NOTE: Ops not installed will be just-in-time (JIT) compiled at
-      runtime if needed. Op compatibility means that your system
-      meet the required dependencies to JIT install the op.
---------------------------------------------------
-JIT compiled ops requires ninja
-ninja .................. [OKAY]
---------------------------------------------------
-op name ................ installed .. compatible
---------------------------------------------------
-cpu_adam ............... [YES] ...... [OKAY]
-fused_adam ............. [NO] ....... [OKAY]
-fused_lamb ............. [NO] ....... [OKAY]
-sparse_attn ............ [NO] ....... [OKAY]
-transformer ............ [NO] ....... [OKAY]
-stochastic_transformer . [NO] ....... [OKAY]
torch 1.8, cuda 11.1torch 1.8, cuda 11.1 - -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -DeepSpeed general environment info:deepspeed info ................... -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ......torch install path torch 1.8, cuda 11.1 -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -DeepSpeed general environment info:DeepSpeed general environment info: - -utils .................. 
[YES] ...... [OKAY] -torch install pathtorch install path .............................. ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch versiontorch version ........................................ 1.8.11.8.1 - -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -torch cuda versiontorch cuda version .............................. 11.111.1 - -nvcc versionnvcc version .......................................... 11.211.2 - -deepspeed install pathdeepspeed install path ...................... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed infodeepspeed info ...................................... 0.4.2+bc17042, bc17042, big-science0.4.2+bc17042, bc17042, big-science - -deepspeed wheel compiled w.deepspeed wheel compiled w. ............ torch 1.8, cuda 11.1torch 1.8, cuda 11.1 - -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_io ............... [NO]async_io ...................... [NO][NO] - ....... [NO] -transformer_inferencetransformer_inference .... [NO][NO] .............. [OKAY][OKAY] - -utils .................. [YES] ......utils [OKAY].................. - [YES] ...... quantizer[OKAY] -.............. [NO] .......quantizer [OKAY].............. --------------------------------------------------- - [NO] ....... [OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... 
torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info DeepSpeed general environment info:................... 0.4.2+bc17042, bc17042, big-science - -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... 
torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -DeepSpeed general environment info: -torch install path ............... 
['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install pathtorch install path .............................. ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch versiontorch version ........................................ 1.8.11.8.1 - -torch cuda versiontorch cuda version .............................. 11.111.1 - -nvcc versionnvcc version .......................................... 11.211.2 - -deepspeed install pathdeepspeed install path ...................... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed infodeepspeed info ...................................... 0.4.2+bc17042, bc17042, big-science0.4.2+bc17042, bc17042, big-science - -deepspeed wheel compiled w.deepspeed wheel compiled w. ............ torch 1.8, cuda 11.1torch 1.8, cuda 11.1 - -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 ----------------------------------------------------------------------------------------------------- - -DeepSpeed C++/CUDA extension op report-------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- --------------------------------------------------- - -DeepSpeed C++/CUDA extension op report -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. ----------------------------------------------------------------------------------------------------- --------------------------------------------------- - - ---------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. 
Op compatibility means that your system - meet the required dependencies to JIT install the op.JIT compiled ops requires ninjaDeepSpeed C++/CUDA extension op report - - - -JIT compiled ops requires ninja-------------------------------------------------- - --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.JIT compiled ops requires ninja - --------------------------------------------------- -JIT compiled ops requires ninja -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -/bin/sh: line 0: type: git: not found -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -/bin/sh: line 0: type: git: not found - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -transformer_inference .. [NO] ....... [OKAY] -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -utils .................. [YES] ...... [OKAY] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install pathDeepSpeed general environment info: ............... -['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']torch install path - ............... torch version .................... 1.8.1 -['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']torch cuda version - ............... torch version11.1 -....................nvcc version 1.8.1..................... - 11.2torch cuda version - deepspeed install path............... ...........11.1 -nvcc version['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -.....................deepspeed info 11.2................... - deepspeed install path0.4.2+bc17042, bc17042, big-science -...........deepspeed wheel compiled w. ......['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -torch 1.8, cuda 11.1 -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... 
-ninja .................. [OKAY]
--------------------------------------------------
-op name ................ installed .. compatible
--------------------------------------------------
-cpu_adam ............... [YES] ...... [OKAY]
-fused_adam ............. [NO] ....... [OKAY]
-fused_lamb ............. [NO] ....... [OKAY]
-sparse_attn ............ [NO] ....... [OKAY]
-transformer ............ [NO] ....... [OKAY]
-stochastic_transformer . [NO] ....... [OKAY]
--------------------------------------------------
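The ninja row above gates the JIT path described in the NOTE: every op marked [NO] under "installed" gets built on first use, which only works if the ninja build tool is available. A minimal way to reproduce that precondition outside DeepSpeed (a generic Python probe, an assumption on my part rather than DeepSpeed's internal check):

import importlib.util
import shutil

# Sketch: the JIT-compile fallback for the ops marked [NO] above needs ninja.
# This generic probe (importable package or binary on PATH) is an assumption;
# DeepSpeed's own report performs its own check.
have_ninja = (importlib.util.find_spec("ninja") is not None
              or shutil.which("ninja") is not None)
print("ninja ..................", "[OKAY]" if have_ninja else "[MISSING]")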
-using world size: 256, data-parallel-size: 8, tensor-model-parallel size: 4, pipeline-model-parallel size: 8
-using torch.float16 for parameters ...
------------------------- arguments ------------------------
- accumulate_allreduce_grads_in_fp32 .............. False
- adam_beta1 ...................................... 0.9
- adam_beta2 ...................................... 0.999
- adam_eps ........................................ 1e-08
- adlr_autoresume ................................. False
- adlr_autoresume_interval ........................ 1000
- apply_query_key_layer_scaling ................... True
- apply_residual_connection_post_layernorm ........ False
- attention_dropout ............................... 0.1
- attention_softmax_in_fp32 ....................... False
- bert_binary_head ................................ True
- bert_load ....................................... None
- bf16 ............................................ False
- bias_dropout_fusion ............................. True
- bias_gelu_fusion ................................ True
- biencoder_projection_dim ........................ 0
- biencoder_shared_query_context_model ............ False
- block_data_path ................................. None
- checkpoint_activations .......................... True
- checkpoint_in_cpu ............................... False
- checkpoint_num_layers ........................... 1
- clip_grad ....................................... 1.0
- codecarbon_dir .................................. /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/tr8-104B-logs/codecarbon
- consumed_train_samples .......................... 0
- consumed_valid_samples .......................... 0
- contigious_checkpointing ........................ False
- cpu_optimizer ................................... False
- cpu_torch_adam .................................. False
- data_impl ....................................... mmap
- data_parallel_size .............................. 8
- data_path ....................................... ['/gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document']
- dataloader_type ................................. single
- DDP_impl ........................................ local
- decoder_seq_length .............................. None
- deepscale ....................................... False
- deepscale_config ................................ None
- deepspeed ....................................... True
- deepspeed_activation_checkpointing .............. True
- deepspeed_config ................................ ./ds_config.1185609.json
- deepspeed_mpi ................................... False
- distribute_checkpointed_activations ............. False
- distributed_backend ............................. nccl
- embedding_path .................................. None
- encoder_seq_length .............................. 2048
- eod_mask_loss ................................... False
- eval_interval ................................... 1000
- eval_iters ...................................... 5
- evidence_data_path .............................. None
- exit_duration_in_mins ........................... 110
- exit_interval ................................... None
- ffn_hidden_size ................................. 20480
- finetune ........................................ False
- fp16 ............................................ True
- fp16_lm_cross_entropy ........................... False
- fp32_residual_connection ........................ False
- global_batch_size ............................... 2048
- hidden_dropout .................................. 0.1
- hidden_size ..................................... 16384
- hysteresis ...................................... 2
- ict_head_size ................................... None
- ict_load ........................................ None
- img_dim ......................................... 224
- indexer_batch_size .............................. 128
- indexer_log_interval ............................ 1000
- init_method_std ................................. 0.02
- init_method_xavier_uniform ...................... False
- initial_loss_scale .............................. 4294967296
- kv_channels ..................................... 512
- layernorm_epsilon ............................... 1e-05
- lazy_mpu_init ................................... None
- load ............................................ /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints
- local_rank ...................................... 0
- log_batch_size_to_tensorboard ................... True
- log_interval .................................... 10
- log_learning_rate_to_tensorboard ................ True
- log_loss_scale_to_tensorboard ................... True
- log_num_zeros_in_grad ........................... False
- log_params_norm ................................. False
- log_timers_to_tensorboard ....................... True
- log_validation_ppl_to_tensorboard ............... True
- loss_scale ...................................... 12.0
- loss_scale_window ............................... 1000
- lr .............................................. 6e-05
- lr_decay_iters .................................. None
- lr_decay_samples ................................ 126953125
- lr_decay_style .................................. cosine
- lr_warmup_fraction .............................. None
- lr_warmup_iters ................................. 0
- lr_warmup_samples ............................... 216320
- make_vocab_size_divisible_by .................... 128
- mask_prob ....................................... 0.15
- masked_softmax_fusion ........................... True
- max_position_embeddings ......................... 2048
- memory_centric_tiled_linear ..................... False
- merge_file ...................................... /gpfswork/rech/six/commun/code/tr8-104B/Megatron-DeepSpeed-tr8-104B/data/gpt2-merges.txt
- micro_batch_size ................................ 1
- min_loss_scale .................................. 1.0
- min_lr .......................................... 6e-06
- mmap_warmup ..................................... False
- no_load_optim ................................... None
- no_load_rng ..................................... None
- no_save_optim ................................... None
- no_save_rng ..................................... None
- num_attention_heads ............................. 32
- num_channels .................................... 3
- num_classes ..................................... 1000
- num_layers ...................................... 32
- num_layers_per_virtual_pipeline_stage ........... None
- num_workers ..................................... 2
- onnx_safe ....................................... None
- openai_gelu ..................................... False
- optimizer ....................................... adam
- override_lr_scheduler ........................... False
- params_dtype .................................... torch.float16
- partition_activations ........................... False
- patch_dim ....................................... 16
- pipeline_model_parallel_size .................... 8
- position_embedding_type ......................... PositionEmbeddingType.absolute
- profile_backward ................................ False
- query_in_block_prob ............................. 0.1
- rampup_batch_size ............................... ['16', '16', '6_000_000']
- rank ............................................ 0
- remote_device ................................... none
- reset_attention_mask ............................ False
- reset_position_ids .............................. False
- retriever_report_topk_accuracies ................ []
- retriever_score_scaling ......................... False
- retriever_seq_length ............................ 256
- sample_rate ..................................... 1.0
- save ............................................ /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints
- save_interval ................................... 1500
- scatter_gather_tensors_in_pipeline .............. True
- scattered_embeddings ............................ False
- seed ............................................ 42
- seq_length ...................................... 2048
- sgd_momentum .................................... 0.9
- short_seq_prob .................................. 0.1
- split ........................................... 949,50,1
- split_transformers .............................. False
- synchronize_each_layer .......................... False
- tensor_model_parallel_size ...................... 4
- tensorboard_dir ................................. /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/tr8-104B-logs/tensorboard
- tensorboard_log_interval ........................ 1
- tensorboard_queue_size .......................... 5
- tile_factor ..................................... 1
- titles_data_path ................................ None
- tokenizer_name_or_path .......................... None
- tokenizer_type .................................. GPT2BPETokenizer
- train_iters ..................................... None
- train_samples ................................... 300000000
- use_checkpoint_lr_scheduler ..................... False
- use_contiguous_buffers_in_ddp ................... False
- use_cpu_initialization .......................... None
- use_one_sent_docs ............................... False
- use_pin_memory .................................. False
- virtual_pipeline_model_parallel_size ............ None
- vocab_extra_ids ................................. 0
- vocab_file ...................................... /gpfswork/rech/six/commun/code/tr8-104B/Megatron-DeepSpeed-tr8-104B/data/gpt2-vocab.json
- weight_decay .................................... 0.1
- world_size ...................................... 256
- zero_allgather_bucket_size ...................... 0.0
- zero_contigious_gradients ....................... False
- zero_reduce_bucket_size ......................... 0.0
- zero_reduce_scatter ............................. False
- zero_stage ...................................... 1
--------------------- end of arguments ---------------------
-will use batch size rampup starting from global batch size 16 to global batch size 2048 with batch size increments 16 over 6000000 samples.
-> building GPT2BPETokenizer tokenizer ...
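Two sets of figures above are worth sanity-checking against each other: the topology line (world size 256 with tensor-model-parallel 4 and pipeline-model-parallel 8) and the batch-size rampup ['16', '16', '6_000_000'] that targets global_batch_size 2048. A minimal sketch of the arithmetic, assuming the ramp spreads its samples evenly across the increments (variable names are illustrative, not Megatron's):

# Topology: data-parallel size is what remains of the world size after
# tensor and pipeline parallelism are factored out.
world_size, tensor_mp, pipeline_mp = 256, 4, 8
data_parallel = world_size // (tensor_mp * pipeline_mp)
assert data_parallel == 8  # matches "data-parallel-size: 8" above

# Rampup: start at a global batch of 16, grow by 16 until 2048, spread
# over the first 6,000,000 samples (even spacing assumed here).
start, increment, ramp_samples, target = 16, 16, 6_000_000, 2048
n_sizes = (target - start) // increment + 1   # 128 distinct batch sizes
print(n_sizes, ramp_samples // n_sizes)       # ~46,875 samples per size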
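The learning-rate arguments above (lr 6e-05, min_lr 6e-06, lr_warmup_samples 216320, lr_decay_samples 126953125, cosine decay style) are sample-based rather than iteration-based. A hedged sketch of the warmup-then-cosine shape those values imply; the real scheduler's boundary handling may differ:

import math

LR_MAX, LR_MIN = 6e-05, 6e-06
WARMUP_SAMPLES, DECAY_SAMPLES = 216_320, 126_953_125

def learning_rate(consumed_samples: int) -> float:
    # Linear warmup over the first 216,320 samples ...
    if consumed_samples < WARMUP_SAMPLES:
        return LR_MAX * consumed_samples / WARMUP_SAMPLES
    # ... then cosine decay toward min_lr over 126,953,125 samples.
    progress = min(consumed_samples, DECAY_SAMPLES) / DECAY_SAMPLES
    return LR_MIN + (LR_MAX - LR_MIN) * 0.5 * (1.0 + math.cos(math.pi * progress))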
- > padded vocab (size: 50257) with 431 dummy tokens (new size: 50688)
-> setting tensorboard ...
-> setting codecarbon ...
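The 431 dummy tokens reported above follow from make_vocab_size_divisible_by (128) multiplied by the tensor-model-parallel size (4): the vocabulary is rounded up to a multiple of 512 so each tensor-parallel shard of the embedding stays aligned. A small sketch of that arithmetic (function name is illustrative):

# Sketch: round the vocab up to a multiple of
# make_vocab_size_divisible_by * tensor_model_parallel_size.
def padded_vocab_size(orig_vocab: int, divisible_by: int = 128, tp_size: int = 4) -> int:
    multiple = divisible_by * tp_size  # 512 here
    return ((orig_vocab + multiple - 1) // multiple) * multiple

assert padded_vocab_size(50257) == 50688  # 50257 + 431 dummy tokens, as logged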
-> initializing torch distributed ...
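At this point every rank joins the NCCL process group. A hedged sketch of what the step boils down to; the actual code path goes through Megatron/DeepSpeed initialization helpers and launcher-provided environment variables rather than this bare call:

import os
import torch.distributed as dist

# Sketch: NCCL process-group initialization (simplified). RANK/WORLD_SIZE and
# MASTER_ADDR/MASTER_PORT are assumed to be set by the launcher.
dist.init_process_group(
    backend="nccl",        # distributed_backend from the arguments above
    init_method="env://",
    world_size=int(os.environ.get("WORLD_SIZE", "1")),
    rank=int(os.environ.get("RANK", "0")),
)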
-async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install pathtorch install path ............... ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']torch version - .................... torch version1.8.1 -.................... torch cuda version1.8.1 -............... torch cuda version11.1 -...............nvcc version 11.1..................... - nvcc version11.2 -..................... deepspeed install path11.2 -........... deepspeed install path ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']........... -deepspeed info['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - ...................deepspeed info 0.4.2+bc17042, bc17042, big-science................... - deepspeed wheel compiled w. 0.4.2+bc17042, bc17042, big-science...... - deepspeed wheel compiled w.torch 1.8, cuda 11.1 -...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 
11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed general environment info: -torch install path ............... 
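For cross-checking, the environment report above can be regenerated at any time: DeepSpeed ships a ds_report console script, and the individual facts are ordinary torch/deepspeed attributes. A minimal sketch:

    # Sketch only: print the same environment facts reported above.
    import torch
    import deepspeed

    print("torch version   :", torch.__version__)      # 1.8.1 in this run
    print("torch cuda      :", torch.version.cuda)     # 11.1
    print("deepspeed info  :", deepspeed.__version__)  # 0.4.2+bc17042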
-/bin/sh: line 0: type: git: not found
-**** Git info for Megatron: git_hash=unknown git_branch=unknown ****
-> initializing tensor model parallel with size 4
-> initializing pipeline model parallel with size 8
-> setting random seeds to 42 ...
-[2021-09-25 02:35:45,964] [INFO] [checkpointing.py:226:model_parallel_cuda_manual_seed] > initializing model parallel cuda seeds on global rank 0, model parallel rank 0, and data parallel rank 0 with model parallel seed: 2760 and data parallel seed: 42
-> compiling dataset index builder ...
-make: Entering directory '/gpfsssd/worksf/projects/rech/six/commun/code/tr8-104B/Megatron-DeepSpeed-tr8-104B/megatron/data'
-make: Nothing to be done for 'default'.
-make: Leaving directory '/gpfsssd/worksf/projects/rech/six/commun/code/tr8-104B/Megatron-DeepSpeed-tr8-104B/megatron/data'
->>> done with dataset index builder. Compilation time: 0.305 seconds
-> compiling and loading fused kernels ...
-/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch/utils/cpp_extension.py:283: UserWarning:
-
-                               !! WARNING !!
-
-!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
-Your compiler (c++) is not compatible with the compiler Pytorch was
-built with for this platform, which is g++ on linux. Please
-use g++ to compile your extension. Alternatively, you may
-compile PyTorch from source using c++, and then you can also use
-c++ to compile your extension.
-
-See https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md for help
-with compiling PyTorch from source.
-!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
-
-                               !! WARNING !!
-
-  warnings.warn(WRONG_COMPILER_WARNING.format(
-Detected CUDA files, patching ldflags
-Emitting ninja build file /gpfsssd/worksf/projects/rech/six/commun/code/tr8-104B/Megatron-DeepSpeed-tr8-104B/megatron/fused_kernels/build/build.ninja...
-Building extension module scaled_upper_triang_masked_softmax_cuda...
-Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
-ninja: no work to do.
-Loading extension module scaled_upper_triang_masked_softmax_cuda...
-Detected CUDA files, patching ldflags
-Emitting ninja build file /gpfsssd/worksf/projects/rech/six/commun/code/tr8-104B/Megatron-DeepSpeed-tr8-104B/megatron/fused_kernels/build/build.ninja...
-Building extension module scaled_masked_softmax_cuda...
-Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
-ninja: no work to do.
-Loading extension module scaled_masked_softmax_cuda...
-Detected CUDA files, patching ldflags
-Emitting ninja build file /gpfsssd/worksf/projects/rech/six/commun/code/tr8-104B/Megatron-DeepSpeed-tr8-104B/megatron/fused_kernels/build/build.ninja...
-Building extension module fused_mix_prec_layer_norm_cuda...
-Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
-ninja: no work to do.
-Loading extension module fused_mix_prec_layer_norm_cuda...
->>> done with compiling and loading fused kernels. Compilation time: 20.734 seconds
-time to initialize megatron (seconds): 8.955
-[after megatron is initialized] datetime: 2021-09-25 02:36:07
-building GPT model ...
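The fused-kernel step above is PyTorch's JIT extension build: torch.utils.cpp_extension writes a ninja build file, finds nothing to rebuild, and loads the cached modules. A minimal sketch of the same mechanism, assuming illustrative source paths (the real ones live under megatron/fused_kernels/):

    # Sketch only: JIT-build a CUDA extension the way the fused kernels above are built.
    from torch.utils import cpp_extension

    scaled_softmax = cpp_extension.load(
        name="scaled_masked_softmax_cuda",
        sources=[
            # hypothetical paths, for illustration only
            "megatron/fused_kernels/scaled_masked_softmax.cpp",
            "megatron/fused_kernels/scaled_masked_softmax_cuda.cu",
        ],
        extra_cuda_cflags=["-O3"],  # illustrative flag
        verbose=True,  # prints the "Emitting ninja build file ..." lines seen above
    )

On repeat runs the cached build directory is reused, which is why the log shows "ninja: no work to do." instead of a recompilation.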
-[2021-09-25 02:36:07,098] [INFO] [utils.py:680:see_memory_usage] Before Building Model
-/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch/cuda/memory.py:373: FutureWarning: torch.cuda.memory_cached has been renamed to torch.cuda.memory_reserved
-  warnings.warn(
-/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch/cuda/memory.py:381: FutureWarning: torch.cuda.max_memory_cached has been renamed to torch.cuda.max_memory_reserved
-  warnings.warn(
-[2021-09-25 02:36:07,100] [INFO] [utils.py:681:see_memory_usage] MA 0.0 GB         Max_MA 0.0 GB         CA 0.0 GB         Max_CA 0 GB
-[2021-09-25 02:36:07,101] [INFO] [utils.py:689:see_memory_usage] CPU Virtual Memory: used = 36.67 GB, percent = 19.6%
-SEED_LAYERS=False BASE_SEED=1234 SEED_FN=None
-Using topology: {ProcessCoord(pipe=0, data=0, model=0): 0, ProcessCoord(pipe=0, data=0, model=1): 1, ProcessCoord(pipe=0, data=0, model=2): 2, ProcessCoord(pipe=0, data=0, model=3): 3, ProcessCoord(pipe=0, data=1, model=0): 4, ProcessCoord(pipe=0, data=1, model=1): 5, ProcessCoord(pipe=0, data=1, model=2): 6, ProcessCoord(pipe=0, data=1, model=3): 7, ProcessCoord(pipe=0, data=2, model=0): 8, ProcessCoord(pipe=0, data=2, model=1): 9, ProcessCoord(pipe=0, data=2, model=2): 10, ProcessCoord(pipe=0, data=2, model=3): 11, ProcessCoord(pipe=0, data=3, model=0): 12, ProcessCoord(pipe=0, data=3, model=1): 13, ProcessCoord(pipe=0, data=3, model=2): 14, ProcessCoord(pipe=0, data=3, model=3): 15, ProcessCoord(pipe=0, data=4, model=0): 16, ProcessCoord(pipe=0, data=4, model=1): 17, ProcessCoord(pipe=0, data=4, model=2): 18, ProcessCoord(pipe=0, data=4, model=3): 19, ProcessCoord(pipe=0, data=5, model=0): 20, ProcessCoord(pipe=0, data=5, model=1): 21, ProcessCoord(pipe=0, data=5, model=2): 22, ProcessCoord(pipe=0, data=5, model=3): 23, ProcessCoord(pipe=0, data=6, model=0): 24, ProcessCoord(pipe=0, data=6, model=1): 25, ProcessCoord(pipe=0, data=6, model=2): 26, ProcessCoord(pipe=0, data=6, model=3): 27, ProcessCoord(pipe=0, data=7, model=0): 28, ProcessCoord(pipe=0, data=7, model=1): 29, ProcessCoord(pipe=0, data=7, model=2): 30, ProcessCoord(pipe=0, data=7, model=3): 31, ProcessCoord(pipe=1, data=0, model=0): 32, ProcessCoord(pipe=1, data=0, model=1): 33, ProcessCoord(pipe=1, data=0, model=2): 34, ProcessCoord(pipe=1, data=0, model=3): 35, ProcessCoord(pipe=1, data=1, model=0): 36, ProcessCoord(pipe=1, data=1, model=1): 37, ProcessCoord(pipe=1, data=1, model=2): 38, ProcessCoord(pipe=1, data=1, model=3): 39, ProcessCoord(pipe=1, data=2, model=0): 40, ProcessCoord(pipe=1, data=2, model=1): 41, ProcessCoord(pipe=1, data=2, model=2): 42, ProcessCoord(pipe=1, data=2, model=3): 43, ProcessCoord(pipe=1, data=3, model=0): 44, ProcessCoord(pipe=1, data=3, model=1): 45, ProcessCoord(pipe=1, data=3, model=2): 46, ProcessCoord(pipe=1, data=3, model=3): 47, ProcessCoord(pipe=1, data=4, model=0): 48, ProcessCoord(pipe=1, data=4, model=1): 49, ProcessCoord(pipe=1, data=4, model=2): 50, ProcessCoord(pipe=1, data=4, model=3): 51, ProcessCoord(pipe=1, data=5, model=0): 52, ProcessCoord(pipe=1, data=5, model=1): 53, ProcessCoord(pipe=1, data=5, model=2): 54, ProcessCoord(pipe=1, data=5, model=3): 55, ProcessCoord(pipe=1, data=6, model=0): 56, ProcessCoord(pipe=1, data=6, model=1): 57, ProcessCoord(pipe=1, data=6, model=2): 58, ProcessCoord(pipe=1, data=6, model=3): 59, ProcessCoord(pipe=1, data=7, model=0): 60, ProcessCoord(pipe=1, data=7, model=1): 61, ProcessCoord(pipe=1, data=7, model=2): 62, ProcessCoord(pipe=1, data=7, model=3): 63, ProcessCoord(pipe=2,
data=0, model=0): 64, ProcessCoord(pipe=2, data=0, model=1): 65, ProcessCoord(pipe=2, data=0, model=2): 66, ProcessCoord(pipe=2, data=0, model=3): 67, ProcessCoord(pipe=2, data=1, model=0): 68, ProcessCoord(pipe=2, data=1, model=1): 69, ProcessCoord(pipe=2, data=1, model=2): 70, ProcessCoord(pipe=2, data=1, model=3): 71, ProcessCoord(pipe=2, data=2, model=0): 72, ProcessCoord(pipe=2, data=2, model=1): 73, ProcessCoord(pipe=2, data=2, model=2): 74, ProcessCoord(pipe=2, data=2, model=3): 75, ProcessCoord(pipe=2, data=3, model=0): 76, ProcessCoord(pipe=2, data=3, model=1): 77, ProcessCoord(pipe=2, data=3, model=2): 78, ProcessCoord(pipe=2, data=3, model=3): 79, ProcessCoord(pipe=2, data=4, model=0): 80, ProcessCoord(pipe=2, data=4, model=1): 81, ProcessCoord(pipe=2, data=4, model=2): 82, ProcessCoord(pipe=2, data=4, model=3): 83, ProcessCoord(pipe=2, data=5, model=0): 84, ProcessCoord(pipe=2, data=5, model=1): 85, ProcessCoord(pipe=2, data=5, model=2): 86, ProcessCoord(pipe=2, data=5, model=3): 87, ProcessCoord(pipe=2, data=6, model=0): 88, ProcessCoord(pipe=2, data=6, model=1): 89, ProcessCoord(pipe=2, data=6, model=2): 90, ProcessCoord(pipe=2, data=6, model=3): 91, ProcessCoord(pipe=2, data=7, model=0): 92, ProcessCoord(pipe=2, data=7, model=1): 93, ProcessCoord(pipe=2, data=7, model=2): 94, ProcessCoord(pipe=2, data=7, model=3): 95, ProcessCoord(pipe=3, data=0, model=0): 96, ProcessCoord(pipe=3, data=0, model=1): 97, ProcessCoord(pipe=3, data=0, model=2): 98, ProcessCoord(pipe=3, data=0, model=3): 99, ProcessCoord(pipe=3, data=1, model=0): 100, ProcessCoord(pipe=3, data=1, model=1): 101, ProcessCoord(pipe=3, data=1, model=2): 102, ProcessCoord(pipe=3, data=1, model=3): 103, ProcessCoord(pipe=3, data=2, model=0): 104, ProcessCoord(pipe=3, data=2, model=1): 105, ProcessCoord(pipe=3, data=2, model=2): 106, ProcessCoord(pipe=3, data=2, model=3): 107, ProcessCoord(pipe=3, data=3, model=0): 108, ProcessCoord(pipe=3, data=3, model=1): 109, ProcessCoord(pipe=3, data=3, model=2): 110, ProcessCoord(pipe=3, data=3, model=3): 111, ProcessCoord(pipe=3, data=4, model=0): 112, ProcessCoord(pipe=3, data=4, model=1): 113, ProcessCoord(pipe=3, data=4, model=2): 114, ProcessCoord(pipe=3, data=4, model=3): 115, ProcessCoord(pipe=3, data=5, model=0): 116, ProcessCoord(pipe=3, data=5, model=1): 117, ProcessCoord(pipe=3, data=5, model=2): 118, ProcessCoord(pipe=3, data=5, model=3): 119, ProcessCoord(pipe=3, data=6, model=0): 120, ProcessCoord(pipe=3, data=6, model=1): 121, ProcessCoord(pipe=3, data=6, model=2): 122, ProcessCoord(pipe=3, data=6, model=3): 123, ProcessCoord(pipe=3, data=7, model=0): 124, ProcessCoord(pipe=3, data=7, model=1): 125, ProcessCoord(pipe=3, data=7, model=2): 126, ProcessCoord(pipe=3, data=7, model=3): 127, ProcessCoord(pipe=4, data=0, model=0): 128, ProcessCoord(pipe=4, data=0, model=1): 129, ProcessCoord(pipe=4, data=0, model=2): 130, ProcessCoord(pipe=4, data=0, model=3): 131, ProcessCoord(pipe=4, data=1, model=0): 132, ProcessCoord(pipe=4, data=1, model=1): 133, ProcessCoord(pipe=4, data=1, model=2): 134, ProcessCoord(pipe=4, data=1, model=3): 135, ProcessCoord(pipe=4, data=2, model=0): 136, ProcessCoord(pipe=4, data=2, model=1): 137, ProcessCoord(pipe=4, data=2, model=2): 138, ProcessCoord(pipe=4, data=2, model=3): 139, ProcessCoord(pipe=4, data=3, model=0): 140, ProcessCoord(pipe=4, data=3, model=1): 141, ProcessCoord(pipe=4, data=3, model=2): 142, ProcessCoord(pipe=4, data=3, model=3): 143, ProcessCoord(pipe=4, data=4, model=0): 144, ProcessCoord(pipe=4, data=4, model=1): 145, 
ProcessCoord(pipe=4, data=4, model=2): 146, ProcessCoord(pipe=4, data=4, model=3): 147, ProcessCoord(pipe=4, data=5, model=0): 148, ProcessCoord(pipe=4, data=5, model=1): 149, ProcessCoord(pipe=4, data=5, model=2): 150, ProcessCoord(pipe=4, data=5, model=3): 151, ProcessCoord(pipe=4, data=6, model=0): 152, ProcessCoord(pipe=4, data=6, model=1): 153, ProcessCoord(pipe=4, data=6, model=2): 154, ProcessCoord(pipe=4, data=6, model=3): 155, ProcessCoord(pipe=4, data=7, model=0): 156, ProcessCoord(pipe=4, data=7, model=1): 157, ProcessCoord(pipe=4, data=7, model=2): 158, ProcessCoord(pipe=4, data=7, model=3): 159, ProcessCoord(pipe=5, data=0, model=0): 160, ProcessCoord(pipe=5, data=0, model=1): 161, ProcessCoord(pipe=5, data=0, model=2): 162, ProcessCoord(pipe=5, data=0, model=3): 163, ProcessCoord(pipe=5, data=1, model=0): 164, ProcessCoord(pipe=5, data=1, model=1): 165, ProcessCoord(pipe=5, data=1, model=2): 166, ProcessCoord(pipe=5, data=1, model=3): 167, ProcessCoord(pipe=5, data=2, model=0): 168, ProcessCoord(pipe=5, data=2, model=1): 169, ProcessCoord(pipe=5, data=2, model=2): 170, ProcessCoord(pipe=5, data=2, model=3): 171, ProcessCoord(pipe=5, data=3, model=0): 172, ProcessCoord(pipe=5, data=3, model=1): 173, ProcessCoord(pipe=5, data=3, model=2): 174, ProcessCoord(pipe=5, data=3, model=3): 175, ProcessCoord(pipe=5, data=4, model=0): 176, ProcessCoord(pipe=5, data=4, model=1): 177, ProcessCoord(pipe=5, data=4, model=2): 178, ProcessCoord(pipe=5, data=4, model=3): 179, ProcessCoord(pipe=5, data=5, model=0): 180, ProcessCoord(pipe=5, data=5, model=1): 181, ProcessCoord(pipe=5, data=5, model=2): 182, ProcessCoord(pipe=5, data=5, model=3): 183, ProcessCoord(pipe=5, data=6, model=0): 184, ProcessCoord(pipe=5, data=6, model=1): 185, ProcessCoord(pipe=5, data=6, model=2): 186, ProcessCoord(pipe=5, data=6, model=3): 187, ProcessCoord(pipe=5, data=7, model=0): 188, ProcessCoord(pipe=5, data=7, model=1): 189, ProcessCoord(pipe=5, data=7, model=2): 190, ProcessCoord(pipe=5, data=7, model=3): 191, ProcessCoord(pipe=6, data=0, model=0): 192, ProcessCoord(pipe=6, data=0, model=1): 193, ProcessCoord(pipe=6, data=0, model=2): 194, ProcessCoord(pipe=6, data=0, model=3): 195, ProcessCoord(pipe=6, data=1, model=0): 196, ProcessCoord(pipe=6, data=1, model=1): 197, ProcessCoord(pipe=6, data=1, model=2): 198, ProcessCoord(pipe=6, data=1, model=3): 199, ProcessCoord(pipe=6, data=2, model=0): 200, ProcessCoord(pipe=6, data=2, model=1): 201, ProcessCoord(pipe=6, data=2, model=2): 202, ProcessCoord(pipe=6, data=2, model=3): 203, ProcessCoord(pipe=6, data=3, model=0): 204, ProcessCoord(pipe=6, data=3, model=1): 205, ProcessCoord(pipe=6, data=3, model=2): 206, ProcessCoord(pipe=6, data=3, model=3): 207, ProcessCoord(pipe=6, data=4, model=0): 208, ProcessCoord(pipe=6, data=4, model=1): 209, ProcessCoord(pipe=6, data=4, model=2): 210, ProcessCoord(pipe=6, data=4, model=3): 211, ProcessCoord(pipe=6, data=5, model=0): 212, ProcessCoord(pipe=6, data=5, model=1): 213, ProcessCoord(pipe=6, data=5, model=2): 214, ProcessCoord(pipe=6, data=5, model=3): 215, ProcessCoord(pipe=6, data=6, model=0): 216, ProcessCoord(pipe=6, data=6, model=1): 217, ProcessCoord(pipe=6, data=6, model=2): 218, ProcessCoord(pipe=6, data=6, model=3): 219, ProcessCoord(pipe=6, data=7, model=0): 220, ProcessCoord(pipe=6, data=7, model=1): 221, ProcessCoord(pipe=6, data=7, model=2): 222, ProcessCoord(pipe=6, data=7, model=3): 223, ProcessCoord(pipe=7, data=0, model=0): 224, ProcessCoord(pipe=7, data=0, model=1): 225, ProcessCoord(pipe=7, data=0, 
model=2): 226, ProcessCoord(pipe=7, data=0, model=3): 227, ProcessCoord(pipe=7, data=1, model=0): 228, ProcessCoord(pipe=7, data=1, model=1): 229, ProcessCoord(pipe=7, data=1, model=2): 230, ProcessCoord(pipe=7, data=1, model=3): 231, ProcessCoord(pipe=7, data=2, model=0): 232, ProcessCoord(pipe=7, data=2, model=1): 233, ProcessCoord(pipe=7, data=2, model=2): 234, ProcessCoord(pipe=7, data=2, model=3): 235, ProcessCoord(pipe=7, data=3, model=0): 236, ProcessCoord(pipe=7, data=3, model=1): 237, ProcessCoord(pipe=7, data=3, model=2): 238, ProcessCoord(pipe=7, data=3, model=3): 239, ProcessCoord(pipe=7, data=4, model=0): 240, ProcessCoord(pipe=7, data=4, model=1): 241, ProcessCoord(pipe=7, data=4, model=2): 242, ProcessCoord(pipe=7, data=4, model=3): 243, ProcessCoord(pipe=7, data=5, model=0): 244, ProcessCoord(pipe=7, data=5, model=1): 245, ProcessCoord(pipe=7, data=5, model=2): 246, ProcessCoord(pipe=7, data=5, model=3): 247, ProcessCoord(pipe=7, data=6, model=0): 248, ProcessCoord(pipe=7, data=6, model=1): 249, ProcessCoord(pipe=7, data=6, model=2): 250, ProcessCoord(pipe=7, data=6, model=3): 251, ProcessCoord(pipe=7, data=7, model=0): 252, ProcessCoord(pipe=7, data=7, model=1): 253, ProcessCoord(pipe=7, data=7, model=2): 254, ProcessCoord(pipe=7, data=7, model=3): 255}
-[2021-09-25 02:36:08,503] [INFO] [module.py:360:_partition_layers] Partitioning pipeline stages with method type:transformer
-stage=0 layers=7
- 0: _to_float16
- 1: EmbeddingPipe
- 2:
- 3: ParallelTransformerLayerPipe
- 4: ParallelTransformerLayerPipe
- 5: ParallelTransformerLayerPipe
- 6: ParallelTransformerLayerPipe
-stage=1 layers=4
- 7: ParallelTransformerLayerPipe
- 8: ParallelTransformerLayerPipe
- 9: ParallelTransformerLayerPipe
- 10: ParallelTransformerLayerPipe
-stage=2 layers=4
- 11: ParallelTransformerLayerPipe
- 12: ParallelTransformerLayerPipe
- 13: ParallelTransformerLayerPipe
- 14: ParallelTransformerLayerPipe
-stage=3 layers=4
- 15: ParallelTransformerLayerPipe
- 16: ParallelTransformerLayerPipe
- 17: ParallelTransformerLayerPipe
- 18: ParallelTransformerLayerPipe
-stage=4 layers=4
- 19: ParallelTransformerLayerPipe
- 20: ParallelTransformerLayerPipe
- 21: ParallelTransformerLayerPipe
- 22: ParallelTransformerLayerPipe
-stage=5 layers=4
- 23: ParallelTransformerLayerPipe
- 24: ParallelTransformerLayerPipe
- 25: ParallelTransformerLayerPipe
- 26: ParallelTransformerLayerPipe
-stage=6 layers=4
- 27: ParallelTransformerLayerPipe
- 28: ParallelTransformerLayerPipe
- 29: ParallelTransformerLayerPipe
- 30: ParallelTransformerLayerPipe
-stage=7 layers=8
- 31: ParallelTransformerLayerPipe
- 32: ParallelTransformerLayerPipe
- 33: ParallelTransformerLayerPipe
- 34: ParallelTransformerLayerPipe
- 35:
- 36: MixedFusedLayerNorm
- 37: EmbeddingPipe
- 38: float16_to_fp32
- loss: CrossEntropy
- > number of parameters on (tensor, pipeline) model parallel rank (2, 2): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (3, 1): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (2, 1): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (0, 1): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (2, 5): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (1, 5): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (0, 5): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (3, 5): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (2, 4): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (0, 4): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (1, 4): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (3, 4): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (3, 6): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (3, 3): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (2, 6): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (0, 6): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (1, 6): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (3, 2): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (2, 3): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (1, 2): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (0, 2): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (1, 1): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (1, 3): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (0, 3): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (1, 7): 1986498560
- > number of parameters on (tensor, pipeline) model parallel rank (1, 0): 1986465792
- > number of parameters on (tensor, pipeline) model parallel rank (2, 0): 1986465792
- > number of parameters on (tensor, pipeline) model parallel rank (2, 7): 1986498560
- > number of parameters on (tensor, pipeline) model parallel rank (0, 7): 1986498560
- > number of parameters on (tensor, pipeline) model parallel rank (3, 0): 1986465792
- > number of parameters on (tensor, pipeline) model parallel rank (3, 7): 1986498560
-[2021-09-25 02:36:09,735] [INFO] [utils.py:680:see_memory_usage] After Building Model
-[2021-09-25 02:36:09,736] [INFO] [utils.py:681:see_memory_usage] MA 3.77 GB Max_MA 3.79 GB CA 3.79 GB Max_CA 4 GB
-[2021-09-25 02:36:09,737] [INFO] [utils.py:689:see_memory_usage] CPU Virtual Memory: used = 36.85 GB, percent = 19.7%
- > number of parameters on (tensor, pipeline) model parallel rank (0, 0): 1986465792
-setting training iterations to 159576
-> learning rate decay style: cosine
-DeepSpeed is enabled.
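The rank table above is easier to audit programmatically: ranks are assigned in row-major order over (pipe, data, model), with the tensor-parallel ("model") index varying fastest. A minimal sketch of that mapping (an illustration, not the Megatron-DeepSpeed source):

```python
# Minimal sketch: reproduce the ProcessCoord -> global rank mapping printed
# above for the 8 x 8 x 4 grid. pipe varies slowest, data next, and the
# model (tensor parallel) index fastest.
from collections import namedtuple

ProcessCoord = namedtuple("ProcessCoord", ["pipe", "data", "model"])

PIPE_DIM, DATA_DIM, MODEL_DIM = 8, 8, 4   # 8 * 8 * 4 = 256 processes

def coord_to_rank(c):
    return (c.pipe * DATA_DIM + c.data) * MODEL_DIM + c.model

# Spot-checks against the table above:
assert coord_to_rank(ProcessCoord(pipe=4, data=4, model=2)) == 146
assert coord_to_rank(ProcessCoord(pipe=6, data=2, model=1)) == 201
assert coord_to_rank(ProcessCoord(pipe=7, data=7, model=3)) == 255
```

Each pipeline stage is thus served by 8 data-parallel replicas x 4 tensor-parallel slices, which matches the repeated per-(tensor, pipeline)-rank parameter counts above: 1745293312 for the 4-layer middle stages, and roughly 1.99B for the first and last stages, which also carry the EmbeddingPipe layers.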
-[2021-09-25 02:36:09,793] [INFO] [logging.py:68:log_dist] [Rank 0] DeepSpeed info: version=0.4.2+bc17042, git-hash=bc17042, git-branch=big-science
-[2021-09-25 02:36:09,889] [INFO] [engine.py:179:__init__] DeepSpeed Flops Profiler Enabled: False
-[2021-09-25 02:36:09,889] [INFO] [engine.py:736:_configure_optimizer] Removing param_group that has no 'params' in the client Optimizer
-[2021-09-25 02:36:09,890] [INFO] [engine.py:741:_configure_optimizer] Using client Optimizer as basic optimizer
-[2021-09-25 02:36:09,890] [INFO] [engine.py:750:_configure_optimizer] DeepSpeed Basic Optimizer = FusedAdam
-[2021-09-25 02:36:09,890] [INFO] [utils.py:43:is_zero_supported_optimizer] Checking ZeRO support for optimizer=FusedAdam type=
-[2021-09-25 02:36:09,890] [INFO] [logging.py:68:log_dist] [Rank 0] Creating fp16 ZeRO stage 1 optimizer
-[2021-09-25 02:36:09,890] [INFO] [stage2.py:106:__init__] Reduce bucket size 500000000
-[2021-09-25 02:36:09,890] [INFO] [stage2.py:107:__init__] Allgather bucket size 500000000
-[2021-09-25 02:36:09,890] [INFO] [stage2.py:108:__init__] CPU Offload: False
-[2021-09-25 02:36:09,890] [INFO] [stage2.py:109:__init__] Round robin gradient partitioning: False
-[2021-09-25 02:36:14,495] [INFO] [stage2.py:419:__init__] optimizer state initialized
-[2021-09-25 02:36:14,495] [INFO] [logging.py:68:log_dist] [Rank 0] DeepSpeed Final Optimizer = FusedAdam
-[2021-09-25 02:36:14,495] [INFO] [engine.py:553:_configure_lr_scheduler] DeepSpeed using client LR scheduler
-[2021-09-25 02:36:14,495] [INFO] [logging.py:68:log_dist] [Rank 0] DeepSpeed LR Scheduler =
-[2021-09-25 02:36:14,495] [INFO] [logging.py:68:log_dist] [Rank 0] step=0, skipped=0, lr=[0.0, 0.0], mom=[(0.9, 0.999), (0.9, 0.999)]
-[2021-09-25 02:36:14,495] [INFO] [config.py:900:print] DeepSpeedEngine configuration:
-[2021-09-25 02:36:14,495] [INFO] [config.py:904:print] activation_checkpointing_config {
- "partition_activations": false,
- "contiguous_memory_optimization": false,
- "cpu_checkpointing": false,
- "number_checkpoints": null,
- "synchronize_checkpoint_boundary": false,
- "profile": false
-}
-[2021-09-25 02:36:14,495] [INFO] [config.py:904:print] aio_config ................... {'block_size': 1048576, 'queue_depth': 8, 'thread_count': 1, 'single_submit': False, 'overlap_events': True}
-[2021-09-25 02:36:14,495] [INFO] [config.py:904:print] allreduce_always_fp32 ........ False
-[2021-09-25 02:36:14,495] [INFO] [config.py:904:print] amp_enabled .................. False
-[2021-09-25 02:36:14,496] [INFO] [config.py:904:print] amp_params ................... False
-[2021-09-25 02:36:14,496] [INFO] [config.py:904:print] checkpoint_tag_validation_enabled True
-[2021-09-25 02:36:14,496] [INFO] [config.py:904:print] checkpoint_tag_validation_fail False
-[2021-09-25 02:36:14,496] [INFO] [config.py:904:print] disable_allgather ............ False
-[2021-09-25 02:36:14,496] [INFO] [config.py:904:print] dump_state ................... False
-[2021-09-25 02:36:14,496] [INFO] [config.py:904:print] dynamic_loss_scale_args ...... {'init_scale': 4096, 'scale_window': 500, 'delayed_shift': 2, 'min_scale': 1}
-[2021-09-25 02:36:14,496] [INFO] [config.py:904:print] eigenvalue_enabled ........... False
-[2021-09-25 02:36:14,496] [INFO] [config.py:904:print] eigenvalue_gas_boundary_resolution 1
-[2021-09-25 02:36:14,496] [INFO] [config.py:904:print] eigenvalue_layer_name ........ bert.encoder.layer
-[2021-09-25 02:36:14,496] [INFO] [config.py:904:print] eigenvalue_layer_num ......... 0
-[2021-09-25 02:36:14,496] [INFO] [config.py:904:print] eigenvalue_max_iter .......... 100
-[2021-09-25 02:36:14,496] [INFO] [config.py:904:print] eigenvalue_stability ......... 1e-06
-[2021-09-25 02:36:14,496] [INFO] [config.py:904:print] eigenvalue_tol ............... 0.01
-[2021-09-25 02:36:14,496] [INFO] [config.py:904:print] eigenvalue_verbose ........... False
-[2021-09-25 02:36:14,496] [INFO] [config.py:904:print] elasticity_enabled ........... False
-[2021-09-25 02:36:14,496] [INFO] [config.py:904:print] flops_profiler_config ........ {
- "enabled": false,
- "profile_step": 1,
- "module_depth": -1,
- "top_modules": 1,
- "detailed": true,
- "output_file": null
-}
-[2021-09-25 02:36:14,496] [INFO] [config.py:904:print] fp16_enabled ................. True
-[2021-09-25 02:36:14,496] [INFO] [config.py:904:print] fp16_mixed_quantize .......... False
-[2021-09-25 02:36:14,496] [INFO] [config.py:904:print] global_rank .................. 0
-[2021-09-25 02:36:14,496] [INFO] [config.py:904:print] gradient_accumulation_steps .. 256
-[2021-09-25 02:36:14,496] [INFO] [config.py:904:print] gradient_clipping ............ 1.0
-[2021-09-25 02:36:14,496] [INFO] [config.py:904:print] gradient_predivide_factor .... 1.0
-[2021-09-25 02:36:14,496] [INFO] [config.py:904:print] initial_dynamic_scale ........ 4096
-[2021-09-25 02:36:14,496] [INFO] [config.py:904:print] loss_scale ................... 0
-[2021-09-25 02:36:14,496] [INFO] [config.py:904:print] memory_breakdown ............. False
-[2021-09-25 02:36:14,496] [INFO] [config.py:904:print] optimizer_legacy_fusion ...... False
-[2021-09-25 02:36:14,496] [INFO] [config.py:904:print] optimizer_name ............... None
-[2021-09-25 02:36:14,496] [INFO] [config.py:904:print] optimizer_params ............. None
-[2021-09-25 02:36:14,496] [INFO] [config.py:904:print] pipeline ..................... {'stages': 'auto', 'partition': 'best', 'seed_layers': False, 'activation_checkpoint_interval': 0}
-[2021-09-25 02:36:14,496] [INFO] [config.py:904:print] pld_enabled .................. False
-[2021-09-25 02:36:14,496] [INFO] [config.py:904:print] pld_params ................... False
-[2021-09-25 02:36:14,496] [INFO] [config.py:904:print] prescale_gradients ........... False
-[2021-09-25 02:36:14,496] [INFO] [config.py:904:print] quantize_change_rate ......... 0.001
-[2021-09-25 02:36:14,496] [INFO] [config.py:904:print] quantize_groups .............. 1
-[2021-09-25 02:36:14,496] [INFO] [config.py:904:print] quantize_offset .............. 1000
-[2021-09-25 02:36:14,496] [INFO] [config.py:904:print] quantize_period .............. 1000
-[2021-09-25 02:36:14,496] [INFO] [config.py:904:print] quantize_rounding ............ 0
-[2021-09-25 02:36:14,497] [INFO] [config.py:904:print] quantize_start_bits .......... 16
-[2021-09-25 02:36:14,497] [INFO] [config.py:904:print] quantize_target_bits ......... 8
-[2021-09-25 02:36:14,497] [INFO] [config.py:904:print] quantize_training_enabled .... False
-[2021-09-25 02:36:14,497] [INFO] [config.py:904:print] quantize_type ................ 0
-[2021-09-25 02:36:14,497] [INFO] [config.py:904:print] quantize_verbose ............. False
-[2021-09-25 02:36:14,497] [INFO] [config.py:904:print] scheduler_name ............... None
-[2021-09-25 02:36:14,497] [INFO] [config.py:904:print] scheduler_params ............. None
-[2021-09-25 02:36:14,497] [INFO] [config.py:904:print] sparse_attention ............. None
-[2021-09-25 02:36:14,497] [INFO] [config.py:904:print] sparse_gradients_enabled ..... False
-[2021-09-25 02:36:14,497] [INFO] [config.py:904:print] steps_per_print .............. 2000
-[2021-09-25 02:36:14,497] [INFO] [config.py:904:print] tensorboard_enabled .......... False
-[2021-09-25 02:36:14,497] [INFO] [config.py:904:print] tensorboard_job_name ......... DeepSpeedJobName
-[2021-09-25 02:36:14,497] [INFO] [config.py:904:print] tensorboard_output_path ......
-[2021-09-25 02:36:14,497] [INFO] [config.py:904:print] train_batch_size ............. 2048
-[2021-09-25 02:36:14,497] [INFO] [config.py:904:print] train_micro_batch_size_per_gpu 1
-[2021-09-25 02:36:14,497] [INFO] [config.py:904:print] use_quantizer_kernel ......... False
-[2021-09-25 02:36:14,497] [INFO] [config.py:904:print] wall_clock_breakdown ......... False
-[2021-09-25 02:36:14,497] [INFO] [config.py:904:print] world_size ................... 8
-[2021-09-25 02:36:14,497] [INFO] [config.py:904:print] zero_allow_untested_optimizer False
-[2021-09-25 02:36:14,497] [INFO] [config.py:904:print] zero_config .................. {
- "stage": 1,
- "contiguous_gradients": false,
- "reduce_scatter": true,
- "reduce_bucket_size": 5.000000e+08,
- "allgather_partitions": true,
- "allgather_bucket_size": 5.000000e+08,
- "overlap_comm": false,
- "load_from_fp32_weights": true,
- "elastic_checkpoint": true,
- "offload_param": null,
- "offload_optimizer": null,
- "sub_group_size": 1.000000e+09,
- "prefetch_bucket_size": 5.000000e+07,
- "param_persistence_threshold": 1.000000e+05,
- "max_live_parameters": 1.000000e+09,
- "max_reuse_distance": 1.000000e+09,
- "gather_fp16_weights_on_model_save": false,
- "ignore_unused_parameters": true,
- "round_robin_gradients": false,
- "legacy_stage1": false
-}
-[2021-09-25 02:36:14,497] [INFO] [config.py:904:print] zero_enabled ................. True
-[2021-09-25 02:36:14,497] [INFO] [config.py:904:print] zero_optimization_stage ...... 1
-[2021-09-25 02:36:14,497] [INFO] [config.py:906:print] json = {
- "train_micro_batch_size_per_gpu": 1,
- "train_batch_size": 2.048000e+03,
- "gradient_clipping": 1.0,
- "zero_optimization": {
- "stage": 1
- },
- "fp16": {
- "enabled": true,
- "loss_scale": 0,
- "loss_scale_window": 500,
- "hysteresis": 2,
- "min_loss_scale": 1,
- "initial_scale_power": 12
- },
- "steps_per_print": 2.000000e+03,
- "wall_clock_breakdown": false
-}
-[2021-09-25 02:36:14,497] [INFO] [engine.py:76:__init__] CONFIG: micro_batches=256 micro_batch_size=1
-[2021-09-25 02:36:15,038] [INFO] [engine.py:134:__init__] RANK=0 STAGE=0 LAYERS=7 [0, 7) STAGE_PARAMS=1986465792 (1986.466M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-25 02:36:15,038] [INFO] [engine.py:134:__init__] RANK=1 STAGE=0 LAYERS=7 [0, 7) STAGE_PARAMS=1986465792 (1986.466M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-25 02:36:15,038] [INFO] [engine.py:134:__init__] RANK=2 STAGE=0 LAYERS=7 [0, 7) STAGE_PARAMS=1986465792 (1986.466M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-25 02:36:15,038] [INFO] [engine.py:134:__init__] RANK=3 STAGE=0 LAYERS=7 [0, 7) STAGE_PARAMS=1986465792 (1986.466M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-25 02:36:15,039] [INFO] [engine.py:134:__init__] RANK=67 STAGE=2 LAYERS=4 [11, 15) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-25 02:36:15,039] [INFO] [engine.py:134:__init__] RANK=64 STAGE=2 LAYERS=4 [11, 15) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-25 02:36:15,039] [INFO] [engine.py:134:__init__] RANK=65 STAGE=2 LAYERS=4 [11, 15) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-25 02:36:15,039] [INFO] [engine.py:134:__init__] RANK=66 STAGE=2 LAYERS=4 [11, 15) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-25 02:36:15,039] [INFO] [engine.py:134:__init__] RANK=193 STAGE=6 LAYERS=4 [27, 31) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-25 02:36:15,039] [INFO] [engine.py:134:__init__] RANK=194 STAGE=6 LAYERS=4 [27, 31) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-25 02:36:15,039] [INFO] [engine.py:134:__init__] RANK=195 STAGE=6 LAYERS=4 [27, 31) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-25 02:36:15,039] [INFO] [engine.py:134:__init__] RANK=129 STAGE=4 LAYERS=4 [19, 23) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-25 02:36:15,039] [INFO] [engine.py:134:__init__] RANK=130 STAGE=4 LAYERS=4 [19, 23) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-25 02:36:15,039] [INFO] [engine.py:134:__init__] RANK=128 STAGE=4 LAYERS=4 [19, 23) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-25 02:36:15,039] [INFO] [engine.py:134:__init__] RANK=227 STAGE=7 LAYERS=8 [31, 39) STAGE_PARAMS=1986498560 (1986.499M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-25 02:36:15,039] [INFO] [engine.py:134:__init__] RANK=224 STAGE=7 LAYERS=8 [31, 39) STAGE_PARAMS=1986498560 (1986.499M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-25 02:36:15,039] [INFO] [engine.py:134:__init__] RANK=225 STAGE=7 LAYERS=8 [31, 39) STAGE_PARAMS=1986498560 (1986.499M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-25 02:36:15,039] [INFO] [engine.py:134:__init__] RANK=226 STAGE=7 LAYERS=8 [31, 39) STAGE_PARAMS=1986498560 (1986.499M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-25 02:36:15,039] [INFO] [engine.py:134:__init__] RANK=98 STAGE=3 LAYERS=4 [15, 19) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-25 02:36:15,039] [INFO] [engine.py:134:__init__] RANK=160 STAGE=5 LAYERS=4 [23, 27) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-25 02:36:15,039] [INFO] [engine.py:134:__init__] RANK=163 STAGE=5 LAYERS=4 [23, 27) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-25 02:36:15,039] [INFO] [engine.py:134:__init__] RANK=161 STAGE=5 LAYERS=4 [23, 27) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-25 02:36:15,039] [INFO] [engine.py:134:__init__] RANK=162 STAGE=5 LAYERS=4 [23, 27) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-25 02:36:15,039] [INFO] [engine.py:134:__init__] RANK=192 STAGE=6 LAYERS=4 [27, 31) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-25 02:36:15,039] [INFO] [engine.py:134:__init__] RANK=131 STAGE=4 LAYERS=4 [19, 23) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-25 02:36:15,039] [INFO] [engine.py:134:__init__] RANK=34 STAGE=1 LAYERS=4 [7, 11) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-25 02:36:15,039] [INFO] [engine.py:134:__init__] RANK=32 STAGE=1 LAYERS=4 [7, 11) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-25 02:36:15,039] [INFO] [engine.py:134:__init__] RANK=35 STAGE=1 LAYERS=4 [7, 11) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-25 02:36:15,039] [INFO] [engine.py:134:__init__] RANK=33 STAGE=1 LAYERS=4 [7, 11) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-25 02:36:15,039] [INFO] [engine.py:134:__init__] RANK=97 STAGE=3 LAYERS=4 [15, 19) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-25 02:36:15,039] [INFO] [engine.py:134:__init__] RANK=99 STAGE=3 LAYERS=4 [15, 19) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-25 02:36:15,039] [INFO] [engine.py:134:__init__] RANK=96 STAGE=3 LAYERS=4 [15, 19) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
- > using checkpoint value 6e-05 for learning rate
- > using checkpoint value 6e-06 for minimum learning rate
- > using checkpoint value 216320 for warmup iterations
- > using checkpoint value 126953125 for total number of iterations
- > using checkpoint value cosine for decay style
-successfully loaded 8 ZeRO state_dicts for rank 180
-successfully loaded 8 ZeRO state_dicts for rank 108
-successfully loaded 8 ZeRO state_dicts for rank 206
-successfully loaded 8 ZeRO state_dicts for rank 168
-successfully loaded 8 ZeRO state_dicts for rank 167
-successfully loaded 8 ZeRO state_dicts for rank 183
-successfully loaded 8 ZeRO state_dicts for rank 112
-successfully loaded 8 ZeRO state_dicts for rank 60
-successfully loaded 8 ZeRO state_dicts for rank 56
-successfully loaded 8 ZeRO state_dicts for rank 63
-successfully loaded 8 ZeRO state_dicts for rank 222
-successfully loaded 8 ZeRO state_dicts for rank 52
-successfully loaded 8 ZeRO state_dicts for rank 177
-successfully loaded 8 ZeRO state_dicts for rank 104
-successfully loaded 8 ZeRO state_dicts for rank 164
-successfully loaded 8 ZeRO state_dicts for rank 176
-successfully loaded 8 ZeRO state_dicts for rank 110
-successfully loaded 8 ZeRO state_dicts for rank 58
-successfully loaded 8 ZeRO state_dicts for rank 178
-successfully loaded 8 ZeRO state_dicts for rank 184
-successfully loaded 8 ZeRO state_dicts for rank 116
-successfully loaded 8 ZeRO state_dicts for rank 127
-successfully loaded 8 ZeRO state_dicts for rank 96
-successfully loaded 8 ZeRO state_dicts for rank 172
-successfully loaded 8 ZeRO state_dicts for rank 188
-successfully loaded 8 ZeRO state_dicts for rank 61
-successfully loaded 8 ZeRO state_dicts for rank 182
-successfully loaded 8 ZeRO state_dicts for rank 204
-successfully loaded 8 ZeRO state_dicts for rank 62
-successfully loaded 8 ZeRO state_dicts for rank 170
-successfully loaded 8 ZeRO state_dicts for rank 124
-successfully loaded 8 ZeRO state_dicts for rank 109
-successfully loaded 8 ZeRO state_dicts for rank 44
-successfully loaded 8 ZeRO state_dicts for rank 166
-successfully loaded 8 ZeRO state_dicts for rank 59
-successfully loaded 8 ZeRO state_dicts for rank 113
-successfully loaded 8 ZeRO state_dicts for rank 200
-successfully loaded 8 ZeRO state_dicts for rank 185
-successfully loaded 8 ZeRO state_dicts for rank 15
-successfully loaded 8 ZeRO state_dicts for rank 214
-successfully loaded 8 ZeRO state_dicts for rank 143
-successfully loaded 8 ZeRO state_dicts for rank 171
-successfully loaded 8 ZeRO state_dicts for rank 169
-successfully loaded 8 ZeRO state_dicts for rank 20
-successfully loaded 8 ZeRO state_dicts for rank 198
-successfully loaded 8 ZeRO state_dicts for rank 161
-successfully loaded 8 ZeRO state_dicts for rank 57
-successfully loaded 8 ZeRO state_dicts for rank 220
-successfully loaded 8 ZeRO state_dicts for rank 158
-successfully loaded 8 ZeRO state_dicts for rank 81
-successfully loaded 8 ZeRO state_dicts for rank 111
-successfully loaded 8 ZeRO state_dicts for rank 120
-successfully loaded 8 ZeRO state_dicts for rank 211
-successfully loaded 8 ZeRO state_dicts for rank 221
-successfully loaded 8 ZeRO state_dicts for rank 16
-successfully loaded 8 ZeRO state_dicts for rank 186
-successfully loaded 8 ZeRO state_dicts for rank 223
-successfully loaded 8 ZeRO state_dicts for rank 93
-successfully loaded 8 ZeRO state_dicts for rank 95
-successfully loaded 8 ZeRO state_dicts for rank 105
-successfully loaded 8 ZeRO state_dicts for rank 21
-successfully loaded 8 ZeRO state_dicts for rank 207
-successfully loaded 8 ZeRO state_dicts for rank 107
-successfully loaded 8 ZeRO state_dicts for rank 194
-successfully loaded 8 ZeRO state_dicts for rank 142
-successfully loaded 8 ZeRO state_dicts for rank 51
-successfully loaded 8 ZeRO state_dicts for rank 209
-successfully loaded 8 ZeRO state_dicts for rank 128
-successfully loaded 8 ZeRO state_dicts for rank 160
-successfully loaded 8 ZeRO state_dicts for rank 83
-successfully loaded 8 ZeRO state_dicts for rank 97
-successfully loaded 8 ZeRO state_dicts for rank 76
-successfully loaded 8 ZeRO state_dicts for rank 135
-successfully loaded 8 ZeRO state_dicts for rank 100
-successfully loaded 8 ZeRO state_dicts for rank 174
-successfully loaded 8 ZeRO state_dicts for rank 23
-successfully loaded 8 ZeRO state_dicts for rank 121
-successfully loaded 8 ZeRO state_dicts for rank 80
-successfully loaded 8 ZeRO state_dicts for rank 75
-successfully loaded 8 ZeRO state_dicts for rank 140
-successfully loaded 8 ZeRO state_dicts for rank 205
-loading 8 zero partition checkpoints for rank 180
-successfully loaded 8 ZeRO state_dicts for rank 190
-successfully loaded 8 ZeRO state_dicts for rank 215
-successfully loaded 8 ZeRO state_dicts for rank 48
-successfully loaded 8 ZeRO state_dicts for rank 202
-successfully loaded 8 ZeRO state_dicts for rank 196
-loading 8 zero partition checkpoints for rank 206
-successfully loaded 8 ZeRO state_dicts for rank 165
-loading 8 zero partition checkpoints for rank 108
-successfully loaded 8 ZeRO state_dicts for rank 179
-successfully loaded 8 ZeRO state_dicts for rank 175
-successfully loaded 8 ZeRO state_dicts for rank 187
-successfully loaded 8 ZeRO state_dicts for rank 126
-successfully loaded 8 ZeRO state_dicts for rank 13
-successfully loaded 8 ZeRO state_dicts for rank 36
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 02:36:41 CEST)" was missed by 0:00:03.258297
-successfully loaded 8 ZeRO state_dicts for rank 199
-successfully loaded 8 ZeRO state_dicts for rank 55
-successfully loaded 8 ZeRO state_dicts for rank 99
-successfully loaded 8 ZeRO state_dicts for rank 115
-successfully loaded 8 ZeRO state_dicts for rank 72
-successfully loaded 8 ZeRO state_dicts for rank 162
-successfully loaded 8 ZeRO state_dicts for rank 203
-successfully loaded 8 ZeRO state_dicts for rank 22
-successfully loaded 8 ZeRO state_dicts for rank 210
-loading 8 zero partition checkpoints for rank 183
-successfully loaded 8 ZeRO state_dicts for rank 82
-successfully loaded 8 ZeRO state_dicts for rank 35
-successfully loaded 8 ZeRO state_dicts for rank 129
-successfully loaded 8 ZeRO state_dicts for rank 131
-successfully loaded 8 ZeRO state_dicts for rank 192
-successfully loaded 8 ZeRO state_dicts for rank 130
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 02:36:42 CEST)" was missed by 0:00:03.400033
-successfully loaded 8 ZeRO state_dicts for rank 156
-successfully loaded 8 ZeRO state_dicts for rank 157
-successfully loaded 8 ZeRO state_dicts for rank 208
-successfully loaded 8 ZeRO state_dicts for rank 141
-successfully loaded 8 ZeRO state_dicts for rank 181
-successfully loaded 8 ZeRO state_dicts for rank 92
-loading 8 zero partition checkpoints for rank 167
-successfully loaded 8 ZeRO state_dicts for rank 12
-successfully loaded 8 ZeRO state_dicts for rank 18
-successfully loaded 8 ZeRO state_dicts for rank 118
-successfully loaded 8 ZeRO state_dicts for rank 19
-successfully loaded 8 ZeRO state_dicts for rank 32
-successfully loaded 8 ZeRO state_dicts for rank 173
-successfully loaded 8 ZeRO state_dicts for rank 236
-successfully loaded 8 ZeRO state_dicts for rank 224
-successfully loaded 8 ZeRO state_dicts for rank 132
-successfully loaded 8 ZeRO state_dicts for rank 195
-successfully loaded 8 ZeRO state_dicts for rank 69
-successfully loaded 8 ZeRO state_dicts for rank 65
-successfully loaded 8 ZeRO state_dicts for rank 41
-successfully loaded 8 ZeRO state_dicts for rank 189
-successfully loaded 8 ZeRO state_dicts for rank 71
-successfully loaded 8 ZeRO state_dicts for rank 79
-successfully loaded 8 ZeRO state_dicts for rank 87
-successfully loaded 8 ZeRO state_dicts for rank 138
-successfully loaded 8 ZeRO state_dicts for rank 212
-successfully loaded 8 ZeRO state_dicts for rank 197
-successfully loaded 8 ZeRO state_dicts for rank 8
-successfully loaded 8 ZeRO state_dicts for rank 134
-successfully loaded 8 ZeRO state_dicts for rank 14
-successfully loaded 8 ZeRO state_dicts for rank 39
-successfully loaded 8 ZeRO state_dicts for rank 201
-successfully loaded 8 ZeRO state_dicts for rank 88
-successfully loaded 8 ZeRO state_dicts for rank 125
-successfully loaded 8 ZeRO state_dicts for rank 91
-successfully loaded 8 ZeRO state_dicts for rank 163
-successfully loaded 8 ZeRO state_dicts for rank 114
-successfully loaded 8 ZeRO state_dicts for rank 237
-successfully loaded 8 ZeRO state_dicts for rank 45
-successfully loaded 8 ZeRO state_dicts for rank 193
-successfully loaded 8 ZeRO state_dicts for rank 106
-loading 8 zero partition checkpoints for rank 56
-successfully loaded 8 ZeRO state_dicts for rank 218
-successfully loaded 8 ZeRO state_dicts for rank 243
-successfully loaded 8 ZeRO state_dicts for rank 25
-successfully loaded 8 ZeRO state_dicts for rank 98
-successfully loaded 8 ZeRO state_dicts for rank 245
-loading 8 zero partition checkpoints for rank 112
-successfully loaded 8 ZeRO state_dicts for rank 240
-successfully loaded 8 ZeRO state_dicts for rank 213
-loading 8 zero partition checkpoints for rank 60
-successfully loaded 8 ZeRO state_dicts for rank 103
-successfully loaded 8 ZeRO state_dicts for rank 0
-successfully loaded 8 ZeRO state_dicts for rank 191
-successfully loaded 8 ZeRO state_dicts for rank 149
-successfully loaded 8 ZeRO state_dicts for rank 252
-successfully loaded 8 ZeRO state_dicts for rank 67
-successfully loaded 8 ZeRO state_dicts for rank 136
-successfully loaded 8 ZeRO state_dicts for rank 49
-successfully loaded 8 ZeRO state_dicts for rank 54
-successfully loaded 8 ZeRO state_dicts for rank 119
-successfully loaded 8 ZeRO state_dicts for rank 77
-successfully loaded 8 ZeRO state_dicts for rank 73
-loading 8 zero partition checkpoints for rank 168
-successfully loaded 8 ZeRO state_dicts for rank 238
-successfully loaded 8 ZeRO state_dicts for rank 139
-successfully loaded 8 ZeRO state_dicts for rank 159
-successfully loaded 8 ZeRO state_dicts for rank 94
-successfully loaded 8 ZeRO state_dicts for rank 147
-successfully loaded 8 ZeRO state_dicts for rank 3
-successfully loaded 8 ZeRO state_dicts for rank 27
-successfully loaded 8 ZeRO state_dicts for rank 233
-successfully loaded 8 ZeRO state_dicts for rank 84
-successfully loaded 8 ZeRO state_dicts for rank 17
-successfully loaded 8 ZeRO state_dicts for rank 117
-successfully loaded 8 ZeRO state_dicts for rank 137
-successfully loaded 8 ZeRO state_dicts for rank 144
-successfully loaded 8 ZeRO state_dicts for rank 133
-successfully loaded 8 ZeRO state_dicts for rank 11
-successfully loaded 8 ZeRO state_dicts for rank 101
-successfully loaded 8 ZeRO state_dicts for rank 248
-successfully loaded 8 ZeRO state_dicts for rank 90
-successfully loaded 8 ZeRO state_dicts for rank 40
-successfully loaded 8 ZeRO state_dicts for rank 46
-successfully loaded 8 ZeRO state_dicts for rank 47
-successfully loaded 8 ZeRO state_dicts for rank 78
-successfully loaded 8 ZeRO state_dicts for rank 242
-loading 8 zero partition checkpoints for rank 52
-successfully loaded 8 ZeRO state_dicts for rank 239
-successfully loaded 8 ZeRO state_dicts for rank 9
-successfully loaded 8 ZeRO state_dicts for rank 74
-successfully loaded 8 ZeRO state_dicts for rank 225
-successfully loaded 8 ZeRO state_dicts for rank 68
-successfully loaded 8 ZeRO state_dicts for rank 146
-loading 8 zero partition checkpoints for rank 104
-successfully loaded 8 ZeRO state_dicts for rank 122
-successfully loaded 8 ZeRO state_dicts for rank 2
-successfully loaded 8 ZeRO state_dicts for rank 123
-successfully loaded 8 ZeRO state_dicts for rank 37
-successfully loaded 8 ZeRO state_dicts for rank 53
-loading 8 zero partition checkpoints for rank 176
-successfully loaded 8 ZeRO state_dicts for rank 150
-successfully loaded 8 ZeRO state_dicts for rank 28
-successfully loaded 8 ZeRO state_dicts for rank 86
-successfully loaded 8 ZeRO state_dicts for rank 234
-successfully loaded 8 ZeRO state_dicts for rank 244
-successfully loaded 8 ZeRO state_dicts for rank 226
-loading 8 zero partition checkpoints for rank 110
-successfully loaded 8 ZeRO state_dicts for rank 145
-successfully loaded 8 ZeRO state_dicts for rank 228
-successfully loaded 8 ZeRO state_dicts for rank 217
-successfully loaded 8 ZeRO state_dicts for rank 216
-successfully loaded 8 ZeRO state_dicts for rank 152
-loading 8 zero partition checkpoints for rank 184
-successfully loaded 8 ZeRO state_dicts for rank 227
-successfully loaded 8 ZeRO state_dicts for rank 85
-successfully loaded 8 ZeRO state_dicts for rank 154
-loading 8 zero partition checkpoints for rank 127
-successfully loaded 8 ZeRO state_dicts for rank 24
-successfully loaded 8 ZeRO state_dicts for rank 241
-loading 8 zero partition checkpoints for rank 96
-successfully loaded 8 ZeRO state_dicts for rank 1
-successfully loaded 8 ZeRO state_dicts for rank 64
-successfully loaded 8 ZeRO state_dicts for rank 50
-successfully loaded 8 ZeRO state_dicts for rank 42
-successfully loaded 8 ZeRO state_dicts for rank 232
-loading 8 zero partition checkpoints for rank 116
-loading 8 zero partition checkpoints for rank 172
-successfully loaded 8 ZeRO state_dicts for rank 33
-successfully loaded 8 ZeRO state_dicts for rank 10
-successfully loaded 8 ZeRO state_dicts for rank 31
-successfully loaded 8 ZeRO state_dicts for rank 38
-loading 8 zero partition checkpoints for rank 63
-successfully loaded 8 ZeRO state_dicts for rank 89
-successfully loaded 8 ZeRO state_dicts for rank 249
-successfully loaded 8 ZeRO state_dicts for rank 246
-loading 8 zero partition checkpoints for rank 58
-successfully loaded 8 ZeRO state_dicts for rank 151
-loading 8 zero partition checkpoints for rank 204
-successfully loaded 8 ZeRO state_dicts for rank 155
-successfully loaded 8 ZeRO state_dicts for rank 34
-successfully loaded 8 ZeRO state_dicts for rank 250
-successfully loaded 8 ZeRO state_dicts for rank 102
-successfully loaded 8 ZeRO state_dicts for rank 230
-successfully loaded 8 ZeRO state_dicts for rank 70
-successfully loaded 8 ZeRO state_dicts for rank 26
-successfully loaded 8 ZeRO state_dicts for rank 29
-loading 8 zero partition checkpoints for rank 62
-loading 8 zero partition checkpoints for rank 182
-loading 8 zero partition checkpoints for rank 124
-loading 8 zero partition checkpoints for rank 109
-successfully loaded 8 ZeRO state_dicts for rank 247
-successfully loaded 8 ZeRO state_dicts for rank 148
-successfully loaded 8 ZeRO state_dicts for rank 30
-successfully loaded 8 ZeRO state_dicts for rank 153
-loading 8 zero partition checkpoints for rank 113
-successfully loaded 8 ZeRO state_dicts for rank 251
-successfully loaded 8 ZeRO state_dicts for rank 43
-successfully loaded 8 ZeRO state_dicts for rank 235
-loading 8 zero partition checkpoints for rank 177
-loading 8 zero partition checkpoints for rank 200
-loading 8 zero partition checkpoints for rank 214
-successfully loaded 8 ZeRO state_dicts for rank 254
-successfully loaded 8 ZeRO state_dicts for rank 229
-loading 8 zero partition checkpoints for rank 164
-loading 8 zero partition checkpoints for rank 44
-loading 8 zero partition checkpoints for rank 211
-loading 8 zero partition checkpoints for rank 111
-loading 8 zero partition checkpoints for rank 221
-loading 8 zero partition checkpoints for rank 143
-successfully loaded 8 ZeRO state_dicts for rank 66
-loading 8 zero partition checkpoints for rank 188
-loading 8 zero partition checkpoints for rank 194
-loading 8 zero partition checkpoints for rank 81
-loading 8 zero partition checkpoints for rank 15
-successfully loaded 8 ZeRO state_dicts for rank 231
-loading 8 zero partition checkpoints for rank 207
-loading 8 zero partition checkpoints for rank 107
-loading 8 zero partition checkpoints for rank 160
-successfully loaded 8 ZeRO state_dicts for rank 253
-loading 8 zero partition checkpoints for rank 105
-loading 8 zero partition checkpoints for rank 186
-loading 8 zero partition checkpoints for rank 223
-loading 8 zero partition checkpoints for rank 95
-loading 8 zero partition checkpoints for rank 174
-loading 8 zero partition checkpoints for rank 51
-loading 8 zero partition checkpoints for rank 61
-loading 8 zero partition checkpoints for rank 120
-loading 8 zero partition checkpoints for rank 135
-loading 8 zero partition checkpoints for rank 97
-loading 8 zero partition checkpoints for rank 140
-loading 8 zero partition checkpoints for rank 16
-loading 8 zero partition checkpoints for rank 198
-loading 8 zero partition checkpoints for rank 100
-loading 8 zero partition checkpoints for rank 171
-loading 8 zero partition checkpoints for rank 205
-loading 8 zero partition checkpoints for rank 76
-successfully loaded 8 ZeRO state_dicts for rank 219
-successfully loaded 8 ZeRO state_dicts for rank 255
-loading 8 zero partition checkpoints for rank 20
-loading 8 zero partition checkpoints for rank 126
-loading 8 zero partition checkpoints for rank 55
-loading 8 zero partition checkpoints for rank 175
-loading 8 zero partition checkpoints for rank 99
-loading 8 zero partition checkpoints for rank 36
-loading 8 zero partition checkpoints for rank 199
-loading 8 zero partition checkpoints for rank 166
-loading 8 zero partition checkpoints for rank 158
-loading 8 zero partition checkpoints for rank 157
-loading 8 zero partition checkpoints for rank 82
-loading 8 zero partition checkpoints for rank 129
-loading 8 zero partition checkpoints for rank 222
-loading 8 zero partition checkpoints for rank 215
-loading 8 zero partition checkpoints for rank 121
-loading 8 zero partition checkpoints for rank 115
-loading 8 zero partition checkpoints for rank 181
-loading 8 zero partition checkpoints for rank 134
-loading 8 zero partition checkpoints for rank 21
-loading 8 zero partition checkpoints for rank 87
-loading 8 zero partition checkpoints for rank 201
-loading 8 zero partition checkpoints for rank 197
-loading 8 zero partition checkpoints for rank 13
-loading 8 zero partition checkpoints for rank 173
-loading 8 zero partition checkpoints for rank 132
-loading 8 zero partition checkpoints for rank 195
-loading 8 zero partition checkpoints for rank 178
-loading 8 zero partition checkpoints for rank 69
-loading 8 zero partition checkpoints for rank 65
-loading 8 zero partition checkpoints for rank 125
-loading 8 zero partition checkpoints for rank 138
-loading 8 zero partition checkpoints for rank 208
-loading 8 zero partition checkpoints for rank 45
-loading 8 zero partition checkpoints for rank 39
-loading 8 zero partition checkpoints for rank 196
-loading 8 zero partition checkpoints for rank 130
-loading 8 zero partition checkpoints for rank 35
-loading 8 zero partition checkpoints for rank 165
-loading 8 zero partition checkpoints for rank 209
-loading 8 zero partition checkpoints for rank 213
-loading 8 zero partition checkpoints for rank 190
-loading 8 zero partition checkpoints for rank 189
-loading 8 zero partition checkpoints for rank 114
-loading 8 zero partition checkpoints for rank 12
-loading 8 zero partition checkpoints for rank 54
-loading 8 zero partition checkpoints for rank 98
-loading 8 zero partition checkpoints for rank 49
-loading 8 zero partition checkpoints for rank 142
-loading 8 zero partition checkpoints for rank 136
-loading 8 zero partition checkpoints for rank 19
-loading 8 zero partition checkpoints for rank 163
-loading 8 zero partition checkpoints for rank 159
-loading 8 zero partition checkpoints for rank 94
-loading 8 zero partition checkpoints for rank 88
-loading 8 zero partition checkpoints for rank 67
-loading 8 zero partition checkpoints for rank 106
-loading 8 zero partition checkpoints for rank 149
-loading 8 zero partition checkpoints for rank 73
-loading 8 zero partition checkpoints for rank 218
-loading 8 zero partition checkpoints for rank 59
-loading 8 zero partition checkpoints for rank 139
-loading 8 zero partition checkpoints for rank 137
-loading 8 zero partition checkpoints for rank 212
-loading 8 zero partition checkpoints for rank 14
-loading 8 zero partition checkpoints for rank 144
-loading 8 zero partition checkpoints for rank 57
-loading 8 zero partition checkpoints for rank 191
-loading 8 zero partition checkpoints for rank 23
-loading 8 zero partition checkpoints for rank 133
-loading 8 zero partition checkpoints for rank 117
-loading 8 zero partition checkpoints for rank 220
-loading 8 zero partition checkpoints for rank 40
-loading 8 zero partition checkpoints for rank 122
-loading 8 zero partition checkpoints for rank 179
-loading 8 zero partition checkpoints for rank 78
-loading 8 zero partition checkpoints for rank 83
-loading 8 zero partition checkpoints for rank 150
-loading 8 zero partition checkpoints for rank 156
-loading 8 zero partition checkpoints for rank 245
-loading 8 zero partition checkpoints for rank 141
-loading 8 zero partition checkpoints for rank 210
-loading 8 zero partition checkpoints for rank 123
-loading 8 zero partition checkpoints for rank 37
-loading 8 zero partition checkpoints for rank 243
-loading 8 zero partition checkpoints for rank 74
-loading 8 zero partition checkpoints for rank 68
-loading 8 zero partition checkpoints for rank 217
-loading 8 zero partition checkpoints for rank 185
-loading 8 zero partition checkpoints for rank 237
-loading 8 zero partition checkpoints for rank 192
-loading 8 zero partition checkpoints for rank 161
-loading 8 zero partition checkpoints for rank 90
-loading 8 zero partition checkpoints for rank 80
-loading 8 zero partition checkpoints for rank 3
-loading 8 zero partition checkpoints for rank 93
-loading 8 zero partition checkpoints for rank 53
-loading 8 zero partition checkpoints for rank 101
-loading 8 zero partition checkpoints for rank 118
-loading 8 zero partition checkpoints for rank 71
-loading 8 zero partition checkpoints for rank 33
-loading 8 zero partition checkpoints for rank 9
-loading 8 zero partition checkpoints for rank 239
-loading 8 zero partition checkpoints for rank 48
-loading 8 zero partition checkpoints for rank 86
-loading 8 zero partition checkpoints for rank 187
-loading 8 zero partition checkpoints for rank 64
-loading 8 zero partition checkpoints for rank 170
-loading 8 zero partition checkpoints for rank 11
-loading 8 zero partition checkpoints for rank 145
-loading 8 zero partition checkpoints for rank 38
-loading 8 zero partition checkpoints for rank 152
-loading 8 zero partition checkpoints for rank 22
-loading 8 zero partition checkpoints for rank 155
-loading 8 zero partition checkpoints for rank 2
-loading 8 zero partition checkpoints for rank 226
-loading 8 zero partition checkpoints for rank 244
-loading 8 zero partition checkpoints for rank 75
-loading 8 zero partition checkpoints for rank 79
-loading 8 zero partition checkpoints for rank 84
-loading 8 zero partition checkpoints for rank 193
-loading 8 zero partition checkpoints for rank 227
-loading 8 zero partition checkpoints for rank 46
-loading 8 zero partition checkpoints for rank 47
-loading 8 zero partition checkpoints for rank 119
-loading 8 zero partition checkpoints for rank 234
-loading 8 zero partition checkpoints for rank 24
-loading 8 zero partition checkpoints for rank 92
-loading 8 zero partition checkpoints for rank 169
-loading 8 zero partition checkpoints for rank 43
-loading 8 zero partition checkpoints for rank 18
-loading 8 zero partition checkpoints for rank 241
-loading 8 zero partition checkpoints for rank 128
-loading 8 zero partition checkpoints for rank 77
-loading 8 zero partition checkpoints for rank 162
-loading 8 zero partition checkpoints for rank 246
-loading 8 zero partition checkpoints for rank 151
-loading 8 zero partition checkpoints for rank 72
-loading 8 zero partition checkpoints for rank 41
-loading 8 zero partition checkpoints for rank 91
-loading 8 zero partition checkpoints for rank 26
-loading 8 zero partition checkpoints for rank 147
-loading 8 zero partition checkpoints for rank 224
-loading 8 zero partition checkpoints for rank 50
-loading 8 zero partition checkpoints for rank 216
-loading 8 zero partition checkpoints for rank 85
-loading 8 zero partition checkpoints for rank 148
-loading 8 zero partition checkpoints for rank 131
-loading 8 zero partition checkpoints for rank 32
-loading 8 zero partition checkpoints for rank 247
-loading 8 zero partition checkpoints for rank 8
-loading 8 zero partition checkpoints for rank 66
-loading 8 zero partition checkpoints for rank 229
-loading 8 zero partition checkpoints for rank 146
-loading 8 zero partition checkpoints for rank 235
-loading 8 zero partition checkpoints for rank 17
-loading 8 zero partition checkpoints for rank 236
-loading 8 zero partition checkpoints for rank 254
-loading 8 zero partition checkpoints for rank 202
-loading 8 zero partition checkpoints for rank 70
-loading 8 zero partition checkpoints for rank 42
-loading 8 zero partition checkpoints for rank 250
-loading 8 zero partition checkpoints for rank 89
-loading 8 zero partition checkpoints for rank 251
-loading 8 zero partition checkpoints for rank 228
-loading 8 zero partition checkpoints for rank 103
-loading 8 zero partition checkpoints for rank 225
-loading 8 zero partition checkpoints for rank 34
-loading 8 zero partition checkpoints for rank 203
-loading 8 zero partition checkpoints for rank 231
-loading 8 zero partition checkpoints for rank 25
-loading 8 zero partition checkpoints for rank 238
-loading 8 zero partition checkpoints for rank 255
-loading 8 zero partition checkpoints for rank 102
-loading 8 zero partition checkpoints for rank 154
-loading 8 zero partition checkpoints for rank 219
-loading 8 zero partition checkpoints for rank 230
-loading 8 zero partition checkpoints for rank 240
-loading 8 zero partition checkpoints for rank 10
-loading 8 zero partition checkpoints for rank 27
-loading 8 zero partition checkpoints for rank 0
- checkpoint version 3.0
-loading 8 zero partition checkpoints for rank 153
-loading 8 zero partition checkpoints for rank 233
-loading 8 zero partition checkpoints for rank 248
-loading 8 zero partition checkpoints for rank 242
-loading 8 zero partition checkpoints for rank 252
-loading 8 zero partition checkpoints for rank 232
-loading 8 zero partition checkpoints for rank 1
-loading 8 zero partition checkpoints for rank 249
-loading 8 zero partition checkpoints for rank 253
-loading 8 zero partition checkpoints for rank 28
-loading 8 zero partition checkpoints for rank 31
-loading 8 zero partition checkpoints for rank 29
-loading 8 zero partition checkpoints for rank 30
-successfully loaded 8 ZeRO state_dicts for rank 5
-loading 8 zero partition checkpoints for rank 5
-successfully loaded 8 ZeRO state_dicts for rank 6
-successfully loaded 8 ZeRO state_dicts for rank 4
-successfully loaded 8 ZeRO state_dicts for rank 7
-loading 8 zero partition checkpoints for rank 6
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 02:38:42 CEST)" was missed by 0:00:03.040753
-loading 8 zero partition checkpoints for rank 4
-loading 8 zero partition checkpoints for rank 7
- successfully loaded checkpoint from /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints at iteration 5827
-time (ms) | load-checkpoint: 94708.03
-[after model, optimizer, and learning rate scheduler are built] datetime: 2021-09-25 02:37:49
-> building train, validation, and test datasets ...
- > datasets target sizes (minimum size):
- train: 300000000
- validation: 1638400
- test: 10240
-> building train, validation, and test datasets for GPT ...
- > building dataset index ...
- reading sizes...
- reading pointers...
- reading document index...
- creating numpy buffer of mmap...
- creating memory view of numpy buffer...
- > finished creating indexed dataset in 0.199121 seconds
- number of documents: 304230423
- > dataset split:
- train:
- document indices in [0, 288714672) total of 288714672 documents
- validation:
- document indices in [288714672, 303926193) total of 15211521 documents
- test:
- document indices in [303926193, 304230423) total of 304230 documents
- > loading doc-idx mapping from /gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document_train_indexmap_300000000ns_2048sl_42s_doc_idx.npy
- > loading sample-idx mapping from /gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document_train_indexmap_300000000ns_2048sl_42s_sample_idx.npy
- > loading shuffle-idx mapping from /gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document_train_indexmap_300000000ns_2048sl_42s_shuffle_idx.npy
- loaded indexed file in 0.460 seconds
- total number of samples: 394611670
- total number of epochs: 3
- > loading doc-idx mapping from /gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document_valid_indexmap_1638400ns_2048sl_42s_doc_idx.npy
- > loading sample-idx mapping from /gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document_valid_indexmap_1638400ns_2048sl_42s_sample_idx.npy
- > loading shuffle-idx mapping from /gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document_valid_indexmap_1638400ns_2048sl_42s_shuffle_idx.npy
- loaded indexed file in 0.335 seconds
- total number of samples: 6927161
- total number of epochs: 1
- > loading doc-idx mapping from /gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document_test_indexmap_10240ns_2048sl_42s_doc_idx.npy
- > loading sample-idx mapping from /gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document_test_indexmap_10240ns_2048sl_42s_sample_idx.npy
- > loading shuffle-idx mapping from /gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document_test_indexmap_10240ns_2048sl_42s_shuffle_idx.npy
- loaded indexed file in 0.163 seconds
- total number of samples: 137384
- total number of epochs: 1
-> finished creating GPT datasets ...
-[after dataloaders are built] datetime: 2021-09-25 02:37:56
-done with setup ...
-training ...
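As a cross-check on the split above: 288,714,672 / 15,211,521 / 304,230 documents is a 949/50/1 split of the 304,230,423 OSCAR documents, with the rounding drift pushed back into the boundaries. A minimal re-derivation (an illustration assuming the conventional Megatron-style "949,50,1" split string, not the training code itself):

```python
# Sketch: recompute the train/valid/test document boundaries printed above
# from an assumed "949,50,1" split over 304,230,423 documents.
def split_indices(weights, size):
    fracs = [w / sum(weights) for w in weights]
    idx = [0]
    for f in fracs:
        idx.append(idx[-1] + int(round(f * size)))
    diff = idx[-1] - size              # rounding drift, here -1
    return [i - diff if n else i for n, i in enumerate(idx)]

print(split_indices([949, 50, 1], 304230423))
# [0, 288714672, 303926193, 304230423] -- matches the boundaries in the log
```

The three training epochs reported are also consistent with the targets: 394,611,670 total samples over 3 epochs imply roughly 131.5M sequences of length 2048 per pass, so two passes (~263M) would fall short of the 300,000,000-sample target and a third is needed.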
-time (ms) | model-and-optimizer-setup: 102787.57 | train/valid/test-data-iterators-setup: 6275.52 -[before the start of training step] datetime: 2021-09-25 02:37:56 -[2021-09-25 02:37:56,930] [INFO] [checkpointing.py:408:forward] Activation Checkpointing Information -[2021-09-25 02:37:56,931] [INFO] [checkpointing.py:409:forward] ----Partition Activations False, CPU CHECKPOINTING False -[2021-09-25 02:37:56,931] [INFO] [checkpointing.py:412:forward] ----contiguous Memory Checkpointing False with 32 total layers -[2021-09-25 02:37:56,931] [INFO] [checkpointing.py:415:forward] ----Synchronization False -[2021-09-25 02:37:56,931] [INFO] [checkpointing.py:416:forward] ----Profiling time in checkpointing False -[Rank 1] (after 5830 iterations) memory (MB) | allocated: 6685.79931640625 | max allocated: 13590.94921875 | reserved: 22862.0 | max reserved: 22862.0 -[Rank 225] (after 5830 iterations) memory (MB) | allocated: 7107.7109375 | max allocated: 11885.68701171875 | reserved: 22492.0 | max reserved: 22492.0 -[Rank 2] (after 5830 iterations) memory (MB) | allocated: 6685.79931640625 | max allocated: 13590.94921875 | reserved: 22862.0 | max reserved: 22862.0 -[Rank 226] (after 5830 iterations) memory (MB) | allocated: 7107.7109375 | max allocated: 11885.6865234375 | reserved: 20752.0 | max reserved: 20752.0 -[Rank 224] (after 5830 iterations) memory (MB) | allocated: 7107.7109375 | max allocated: 11885.6875 | reserved: 22492.0 | max reserved: 22492.0 -[Rank 0] (after 5830 iterations) memory (MB) | allocated: 6685.79931640625 | max allocated: 13590.94921875 | reserved: 23246.0 | max reserved: 23246.0 -[Rank 3] (after 5830 iterations) memory (MB) | allocated: 6685.79931640625 | max allocated: 13590.94921875 | reserved: 22862.0 | max reserved: 22862.0 -[Rank 227] (after 5830 iterations) memory (MB) | allocated: 7107.7109375 | max allocated: 11885.68701171875 | reserved: 22492.0 | max reserved: 22492.0 - iteration 5830/ 159576 | consumed samples: 168368 | elapsed time per iteration (ms): 21875.4 | learning rate: 4.656E-05 | global batch size: 64 | lm loss: 6.454423E+00 | loss scale: 2048.0 | grad norm: 45630.759 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -[Rank 65] (after 5830 iterations) memory (MB) | allocated: 5861.55029296875 | max allocated: 11810.46728515625 | reserved: 19902.0 | max reserved: 19902.0 -[Rank 33] (after 5830 iterations) memory (MB) | allocated: 5861.55029296875 | max allocated: 12082.4677734375 | reserved: 19866.0 | max reserved: 19866.0 -[Rank 97] (after 5830 iterations) memory (MB) | allocated: 5861.55029296875 | max allocated: 11538.466796875 | reserved: 19402.0 | max reserved: 19402.0 -[Rank 66] (after 5830 iterations) memory (MB) | allocated: 5861.55029296875 | max allocated: 11810.46728515625 | reserved: 19890.0 | max reserved: 19890.0 -[Rank 34] (after 5830 iterations) memory (MB) | allocated: 5861.55029296875 | max allocated: 12082.4677734375 | reserved: 20370.0 | max reserved: 20370.0 -[Rank 193] (after 5830 iterations) memory (MB) | allocated: 5861.55029296875 | max allocated: 10722.46533203125 | reserved: 19066.0 | max reserved: 19066.0 -[Rank 161] (after 5830 iterations) memory (MB) | allocated: 5861.55029296875 | max allocated: 10994.4658203125 | reserved: 19146.0 | max reserved: 19146.0 -[Rank 129] (after 5830 iterations) memory (MB) | allocated: 5861.55029296875 | max allocated: 11266.46630859375 | reserved: 19582.0 | max reserved: 19582.0 -[Rank 162] (after 5830 iterations) memory (MB) | allocated: 
5861.55029296875 | max allocated: 10994.4658203125 | reserved: 19066.0 | max reserved: 19066.0 -[Rank 130] (after 5830 iterations) memory (MB) | allocated: 5861.55029296875 | max allocated: 11266.46630859375 | reserved: 19434.0 | max reserved: 19434.0 -[Rank 98] (after 5830 iterations) memory (MB) | allocated: 5861.55029296875 | max allocated: 11538.466796875 | reserved: 19674.0 | max reserved: 19674.0 -[Rank 194] (after 5830 iterations) memory (MB) | allocated: 5861.55029296875 | max allocated: 10722.46533203125 | reserved: 19066.0 | max reserved: 19066.0 -[Rank 64] (after 5830 iterations) memory (MB) | allocated: 5861.55029296875 | max allocated: 11810.46728515625 | reserved: 20536.0 | max reserved: 20536.0 -[Rank 32] (after 5830 iterations) memory (MB) | allocated: 5861.55029296875 | max allocated: 12082.4677734375 | reserved: 20408.0 | max reserved: 20408.0 -[Rank 99] (after 5830 iterations) memory (MB) | allocated: 5861.55029296875 | max allocated: 11538.466796875 | reserved: 19838.0 | max reserved: 19838.0 -[Rank 131] (after 5830 iterations) memory (MB) | allocated: 5861.55029296875 | max allocated: 11266.46630859375 | reserved: 19502.0 | max reserved: 19502.0 -[Rank 67] (after 5830 iterations) memory (MB) | allocated: 5861.55029296875 | max allocated: 11810.46728515625 | reserved: 19902.0 | max reserved: 19902.0 -[Rank 35] (after 5830 iterations) memory (MB) | allocated: 5861.55029296875 | max allocated: 12082.4677734375 | reserved: 19866.0 | max reserved: 19866.0 -[Rank 192] (after 5830 iterations) memory (MB) | allocated: 5861.55029296875 | max allocated: 10722.46533203125 | reserved: 19012.0 | max reserved: 19012.0 -[Rank 128] (after 5830 iterations) memory (MB) | allocated: 5861.55029296875 | max allocated: 11266.46630859375 | reserved: 19908.0 | max reserved: 19908.0 -[Rank 160] (after 5830 iterations) memory (MB) | allocated: 5861.55029296875 | max allocated: 10994.4658203125 | reserved: 19636.0 | max reserved: 19636.0 -[Rank 96] (after 5830 iterations) memory (MB) | allocated: 5861.55029296875 | max allocated: 11538.466796875 | reserved: 19988.0 | max reserved: 19988.0 -[Rank 163] (after 5830 iterations) memory (MB) | allocated: 5861.55029296875 | max allocated: 10994.4658203125 | reserved: 19306.0 | max reserved: 19306.0 -[Rank 195] (after 5830 iterations) memory (MB) | allocated: 5861.55029296875 | max allocated: 10722.46533203125 | reserved: 18826.0 | max reserved: 18826.0 - iteration 5840/ 159576 | consumed samples: 169008 | elapsed time per iteration (ms): 16822.3 | learning rate: 4.674E-05 | global batch size: 64 | lm loss: 6.392004E+00 | loss scale: 2048.0 | grad norm: 53106.299 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5850/ 159576 | consumed samples: 169648 | elapsed time per iteration (ms): 16813.6 | learning rate: 4.692E-05 | global batch size: 64 | lm loss: 6.347363E+00 | loss scale: 2048.0 | grad norm: 53512.430 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5860/ 159576 | consumed samples: 170288 | elapsed time per iteration (ms): 16773.5 | learning rate: 4.709E-05 | global batch size: 64 | lm loss: 6.368040E+00 | loss scale: 2048.0 | grad norm: 49687.313 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5870/ 159576 | consumed samples: 170928 | elapsed time per iteration (ms): 16844.9 | learning rate: 4.727E-05 | global batch size: 64 | lm loss: 6.372821E+00 | loss scale: 2048.0 | grad 
norm: 49107.159 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5880/ 159576 | consumed samples: 171568 | elapsed time per iteration (ms): 16812.2 | learning rate: 4.745E-05 | global batch size: 64 | lm loss: 6.379050E+00 | loss scale: 2048.0 | grad norm: 76898.126 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5890/ 159576 | consumed samples: 172208 | elapsed time per iteration (ms): 16819.7 | learning rate: 4.763E-05 | global batch size: 64 | lm loss: 6.333071E+00 | loss scale: 2048.0 | grad norm: 69874.656 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5900/ 159576 | consumed samples: 172848 | elapsed time per iteration (ms): 16821.3 | learning rate: 4.780E-05 | global batch size: 64 | lm loss: 6.354385E+00 | loss scale: 2048.0 | grad norm: 57915.999 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5910/ 159576 | consumed samples: 173488 | elapsed time per iteration (ms): 16679.9 | learning rate: 4.798E-05 | global batch size: 64 | lm loss: 6.361916E+00 | loss scale: 2048.0 | grad norm: 56535.869 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5920/ 159576 | consumed samples: 174128 | elapsed time per iteration (ms): 16731.8 | learning rate: 4.816E-05 | global batch size: 64 | lm loss: 6.371978E+00 | loss scale: 2048.0 | grad norm: 75613.913 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5930/ 159576 | consumed samples: 174768 | elapsed time per iteration (ms): 16796.3 | learning rate: 4.834E-05 | global batch size: 64 | lm loss: 6.373956E+00 | loss scale: 2048.0 | grad norm: 64436.905 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -[2021-09-25 03:08:32] PULSE: tr8-104B is waiting for the previous job to finish before scheduling a new one using the dependency mechanism (1185639_[1-10%1] on 'gpu_p13' partition) -[2021-09-25 03:08:32] PULSE: tr8-104B is running for 33:04 since 2021-09-25T02:35:28 (1185609 on 'gpu_p13' partition (r6i5n[7-8],r6i6n0,r7i2n[4-5],r7i6n[2-4],r7i7n[7-8],r8i0n[0,2-3,5-8],r8i1n[0,2-4],r8i2n8,r8i3n[0-2],r8i5n[3-4],r8i7n[3-8],r9i0n[0-5],r9i1n[0-3],r9i2n[3-5,8],r9i3n[0-1,7-8],r9i4n[0-3],r9i5n[3-8],r9i6n[0,7-8]) - iteration 5940/ 159576 | consumed samples: 175408 | elapsed time per iteration (ms): 16680.4 | learning rate: 4.851E-05 | global batch size: 64 | lm loss: 6.367229E+00 | loss scale: 2048.0 | grad norm: 61103.619 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5950/ 159576 | consumed samples: 176048 | elapsed time per iteration (ms): 16548.2 | learning rate: 4.869E-05 | global batch size: 64 | lm loss: 6.365273E+00 | loss scale: 2048.0 | grad norm: 74137.806 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5960/ 159576 | consumed samples: 176688 | elapsed time per iteration (ms): 16720.7 | learning rate: 4.887E-05 | global batch size: 64 | lm loss: 6.339179E+00 | loss scale: 2048.0 | grad norm: 117906.851 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 5970/ 159576 | consumed samples: 177328 | elapsed time per iteration (ms): 16666.6 | learning rate: 4.905E-05 | global batch size: 64 | lm loss: 
- iteration 5970/ 159576 | consumed samples: 177328 | elapsed time per iteration (ms): 16666.6 | learning rate: 4.905E-05 | global batch size: 64 | lm loss: 6.366007E+00 | loss scale: 2048.0 | grad norm: 135736.452 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5980/ 159576 | consumed samples: 177968 | elapsed time per iteration (ms): 16712.0 | learning rate: 4.922E-05 | global batch size: 64 | lm loss: 6.311417E+00 | loss scale: 2048.0 | grad norm: 59672.483 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 5990/ 159576 | consumed samples: 178608 | elapsed time per iteration (ms): 16795.9 | learning rate: 4.940E-05 | global batch size: 64 | lm loss: 6.346366E+00 | loss scale: 2048.0 | grad norm: 70394.026 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-[2021-09-25 03:26:24,359] [INFO] [logging.py:68:log_dist] [Rank 0] step=6000, skipped=13, lr=[4.9579881656804734e-05, 4.9579881656804734e-05], mom=[(0.9, 0.999), (0.9, 0.999)]
-steps: 6000 loss: 6.4051 iter time (s): 0.008 samples/sec: 7888.018
- iteration 6000/ 159576 | consumed samples: 179248 | elapsed time per iteration (ms): 16825.1 | learning rate: 4.958E-05 | global batch size: 64 | lm loss: 6.338142E+00 | loss scale: 2048.0 | grad norm: 51469.855 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-------------------------------------------------------------------------------------------------
- validation loss at iteration 6000 | lm loss value: 6.305492E+00 | lm loss PPL: 5.475711E+02 |
-------------------------------------------------------------------------------------------------
-saving checkpoint at iteration 6000 to /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints
-[2021-09-25 03:26:46,630] [INFO] [logging.py:68:log_dist] [Rank 0] Saving model checkpoint: /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/global_step6000/mp_rank_00_model_states.pt
- successfully saved checkpoint at iteration 6000 to /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints
-time (ms) | save-checkpoint: 18535.85
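The validation block reports `lm loss PPL` alongside the loss itself; perplexity is just the exponential of the per-token cross-entropy, and the two printed values are consistent with each other:

```python
import math

lm_loss = 6.305492       # validation lm loss at iteration 6000
ppl = math.exp(lm_loss)  # perplexity = exp(cross-entropy)
print(f"{ppl:.4E}")      # 5.4757E+02, matching "lm loss PPL: 5.475711E+02"
```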
- iteration 6010/ 159576 | consumed samples: 179888 | elapsed time per iteration (ms): 19605.0 | learning rate: 4.976E-05 | global batch size: 64 | lm loss: 6.332598E+00 | loss scale: 2048.0 | grad norm: 64216.775 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6020/ 159576 | consumed samples: 180528 | elapsed time per iteration (ms): 16682.2 | learning rate: 4.993E-05 | global batch size: 64 | lm loss: 6.346989E+00 | loss scale: 2048.0 | grad norm: 65052.382 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6030/ 159576 | consumed samples: 181168 | elapsed time per iteration (ms): 16536.1 | learning rate: 5.011E-05 | global batch size: 64 | lm loss: 6.314711E+00 | loss scale: 2048.0 | grad norm: 61186.621 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6040/ 159576 | consumed samples: 181808 | elapsed time per iteration (ms): 16509.4 | learning rate: 5.029E-05 | global batch size: 64 | lm loss: 6.347876E+00 | loss scale: 2048.0 | grad norm: 80684.961 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6050/ 159576 | consumed samples: 182448 | elapsed time per iteration (ms): 16821.6 | learning rate: 5.047E-05 | global batch size: 64 | lm loss: 6.345741E+00 | loss scale: 2048.0 | grad norm: 207970.428 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6060/ 159576 | consumed samples: 183088 | elapsed time per iteration (ms): 16815.3 | learning rate: 5.064E-05 | global batch size: 64 | lm loss: 6.341463E+00 | loss scale: 2048.0 | grad norm: 57913.351 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6070/ 159576 | consumed samples: 183728 | elapsed time per iteration (ms): 16825.8 | learning rate: 5.082E-05 | global batch size: 64 | lm loss: 6.336625E+00 | loss scale: 2048.0 | grad norm: 62496.040 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6080/ 159576 | consumed samples: 184368 | elapsed time per iteration (ms): 16749.3 | learning rate: 5.100E-05 | global batch size: 64 | lm loss: 6.378619E+00 | loss scale: 2048.0 | grad norm: 53421.784 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6090/ 159576 | consumed samples: 185008 | elapsed time per iteration (ms): 16844.2 | learning rate: 5.118E-05 | global batch size: 64 | lm loss: 6.363810E+00 | loss scale: 2048.0 | grad norm: 53621.070 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6100/ 159576 | consumed samples: 185648 | elapsed time per iteration (ms): 16803.1 | learning rate: 5.136E-05 | global batch size: 64 | lm loss: 6.397610E+00 | loss scale: 2048.0 | grad norm: 63234.859 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6110/ 159576 | consumed samples: 186288 | elapsed time per iteration (ms): 16808.5 | learning rate: 5.153E-05 | global batch size: 64 | lm loss: 6.359557E+00 | loss scale: 2048.0 | grad norm: 52582.093 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6120/ 159576 | consumed samples: 186928 | elapsed time per iteration (ms): 16792.9 | learning rate: 5.171E-05 | global batch size: 64 | lm loss: 6.347573E+00 | loss scale: 2048.0 | grad norm: 50959.488 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6130/ 159576 | consumed samples: 187568 | elapsed time per iteration (ms): 16806.7 | learning rate: 5.189E-05 | global batch size: 64 | lm loss: 6.351057E+00 | loss scale: 2048.0 | grad norm: 152670.439 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6140/ 159576 | consumed samples: 188208 | elapsed time per iteration (ms): 16808.0 | learning rate: 5.207E-05 | global batch size: 64 | lm loss: 6.374673E+00 | loss scale: 2048.0 | grad norm: 50742.188 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-[2021-09-25 04:08:28] PULSE: tr8-104B is waiting for the previous job to finish before scheduling a new one using the dependency mechanism (1185639_[1-10%1] on 'gpu_p13' partition)
-[2021-09-25 04:08:28] PULSE: tr8-104B is running for 1:33:00 since 2021-09-25T02:35:28 (1185609 on 'gpu_p13' partition (r6i5n[7-8],r6i6n0,r7i2n[4-5],r7i6n[2-4],r7i7n[7-8],r8i0n[0,2-3,5-8],r8i1n[0,2-4],r8i2n8,r8i3n[0-2],r8i5n[3-4],r8i7n[3-8],r9i0n[0-5],r9i1n[0-3],r9i2n[3-5,8],r9i3n[0-1,7-8],r9i4n[0-3],r9i5n[3-8],r9i6n[0,7-8])
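The `loss scale`, `number of skipped iterations`, and the earlier `skipped=13` fields come from fp16 dynamic loss scaling: when an overflow is detected the step is skipped and the scale is halved; after a window of clean steps the scale is doubled (visible just below, where the scale moves from 2048.0 to 4096.0 by iteration 6210). A minimal sketch of that rule, with illustrative window and bound values rather than this run's configuration:

```python
class DynamicLossScaler:
    """Toy dynamic loss scaler: halve on overflow, double after a clean window."""

    def __init__(self, init_scale=2048.0, window=1000, min_scale=1.0):
        self.scale = init_scale
        self.window = window
        self.min_scale = min_scale
        self.clean_steps = 0

    def update(self, found_overflow: bool) -> bool:
        """Return True if the optimizer step should be skipped."""
        if found_overflow:
            self.scale = max(self.scale / 2, self.min_scale)
            self.clean_steps = 0
            return True          # counted as a "skipped iteration" in the log
        self.clean_steps += 1
        if self.clean_steps >= self.window:
            self.scale *= 2      # e.g. 2048.0 -> 4096.0 as seen below
            self.clean_steps = 0
        return False
```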
- iteration 6150/ 159576 | consumed samples: 188848 | elapsed time per iteration (ms): 16696.6 | learning rate: 5.224E-05 | global batch size: 64 | lm loss: 6.323299E+00 | loss scale: 2048.0 | grad norm: 55101.801 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6160/ 159576 | consumed samples: 189600 | elapsed time per iteration (ms): 17385.3 | learning rate: 5.245E-05 | global batch size: 80 | lm loss: 6.368839E+00 | loss scale: 2048.0 | grad norm: 51296.238 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6170/ 159576 | consumed samples: 190400 | elapsed time per iteration (ms): 17823.6 | learning rate: 5.267E-05 | global batch size: 80 | lm loss: 6.355129E+00 | loss scale: 2048.0 | grad norm: 85490.491 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6180/ 159576 | consumed samples: 191200 | elapsed time per iteration (ms): 17757.4 | learning rate: 5.289E-05 | global batch size: 80 | lm loss: 6.373211E+00 | loss scale: 2048.0 | grad norm: 112584.865 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6190/ 159576 | consumed samples: 192000 | elapsed time per iteration (ms): 17583.1 | learning rate: 5.312E-05 | global batch size: 80 | lm loss: 6.372861E+00 | loss scale: 2048.0 | grad norm: 102723.952 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6200/ 159576 | consumed samples: 192800 | elapsed time per iteration (ms): 17380.3 | learning rate: 5.334E-05 | global batch size: 80 | lm loss: 6.336594E+00 | loss scale: 2048.0 | grad norm: 41950.188 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6210/ 159576 | consumed samples: 193600 | elapsed time per iteration (ms): 17443.3 | learning rate: 5.356E-05 | global batch size: 80 | lm loss: 6.355303E+00 | loss scale: 4096.0 | grad norm: 131493.175 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-saving checkpoint at iteration 6210 to /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints
-[2021-09-25 04:26:31,836] [INFO] [logging.py:68:log_dist] [Rank 0] Saving model checkpoint: /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/global_step6210/mp_rank_00_model_states.pt
- successfully saved checkpoint at iteration 6210 to /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints
-time (ms) | save-checkpoint: 19730.09
-[exiting program after 110.16574249267578 minutes] datetime: 2021-09-25 04:26:46
-*****************************************
-Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
-*****************************************
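The banner above (emitted once per spawned process at every launch, and deduplicated here) is the torch.distributed launcher defaulting OMP_NUM_THREADS to 1. If the CPU-side work of the run (e.g. the cpu_adam op reported below) benefits from more threads, the variable can be tuned per process; a small sketch, where the value 4 is an illustrative assumption rather than this run's setting:

```python
import os

# Must be set before OpenMP-backed libraries initialise their thread pools.
os.environ.setdefault("OMP_NUM_THREADS", "4")

import torch

# Keep PyTorch's intra-op thread pool consistent with the OpenMP setting.
torch.set_num_threads(int(os.environ["OMP_NUM_THREADS"]))
```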
-***************************************** -***************************************** -Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. -***************************************** --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed-------------------------------------------------- .. - compatibleDeepSpeed C++/CUDA extension op report - ----------------------------------------------------------------------------------------------------- - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. 
compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. 
compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... 
[OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. 
--------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] ---------------------------------------------------fused_adam -............. DeepSpeed C++/CUDA extension op report[NO] - .......-------------------------------------------------- -[OKAY]NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - --------------------------------------------------- -JIT compiled ops requires ninjafused_lamb - ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... 
[OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adamninja ............. ..................[NO] .......[OKAY] -[OKAY] --------------------------------------------------- -op name fused_lamb................ .............installed [NO].. .......compatible -[OKAY] --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -sparse_attn fused_adam ............. [NO] ................... [NO][OKAY] -....... [OKAY]fused_lamb - ............. [NO] .......transformer [OKAY]............ - [NO] ....... [OKAY] --------------------------------------------------- -stochastic_transformer . [NO]sparse_attn ................... [OKAY][NO] -....... [OKAY] -transformer ............ [NO] ....... [OKAY] -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. 
[OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... 
[OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. 
--------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... 
- [WARNING]  async_io requires the libraries: ['libaio-dev'] but they are missing. Can be fixed by: `apt install libaio-dev`.
-async_io ............... [NO] ....... [NO]
-transformer_inference .. [NO] ....... [OKAY]
-utils .................. [YES] ...... [OKAY]
-quantizer .............. [NO] ....... [OKAY]
---------------------------------------------------
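-(The warning above names its own fix; a minimal sketch for a Debian-based node, assuming root access and that DeepSpeed is reinstalled afterwards so the op can build; DS_BUILD_AIO is the build flag that forces the async_io op:)
-  $ sudo apt install libaio-dev
-  $ DS_BUILD_AIO=1 pip install --force-reinstall --no-cache-dir deepspeed
-(In this run the library was evidently left missing, so async_io reports [NO] and the op simply stays disabled.)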
-DeepSpeed general environment info:
-torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']
-torch version .................... 1.8.1
-torch cuda version ............... 11.1
-nvcc version ..................... 11.2
-deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']
-deepspeed info ................... 0.4.2+bc17042, bc17042, big-science
-deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1
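-(The pairing above, a torch 1.8.1 / CUDA 11.1 wheel running against nvcc 11.2 on the node, can be double-checked with standard commands:)
-  $ python -c "import torch; print(torch.__version__, torch.version.cuda)"  # expect: 1.8.1 11.1
-  $ nvcc --version                                                          # expect: release 11.2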
---------------------------------------------------
-DeepSpeed C++/CUDA extension op report
---------------------------------------------------
-NOTE: Ops not installed will be just-in-time (JIT) compiled at
-      runtime if needed. Op compatibility means that your system
-      meets the required dependencies to JIT install the op.
--------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -DeepSpeed general environment info: -DeepSpeed general environment info:torch install path -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -............... torch install path ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']............... - torch version .................... 1.8.1 -['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch cuda version ...............torch version 11.1.................... -stochastic_transformer . [NO] ....... [OKAY] - 1.8.1nvcc version - ..................... torch cuda version11.2 -...............deepspeed install path 11.1........... - nvcc version ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']..................... - deepspeed info11.2 -...................deepspeed install path 0.4.2+bc17042, bc17042, big-science........... - deepspeed wheel compiled w. ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']...... - deepspeed infotorch 1.8, cuda 11.1 -................... 
0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... async_io[NO] ...................... [NO][NO] - ....... [NO] -transformer_inference ..transformer_inference [NO].. .......[NO] [OKAY]....... - [OKAY] -utils ..................utils [YES].................. ......[YES] [OKAY]...... - [OKAY] -quantizer quantizer.............. [NO].............. .......[NO] [OKAY]....... - [OKAY] --------------------------------------------------- --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -ninja .................. [OKAY] -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science --------------------------------------------------- -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] --------------------------------------------------- -stochastic_transformer . [NO] ....... [OKAY] -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. 
compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -DeepSpeed general environment info: -sparse_attn ............ [NO] ....... [OKAY] -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -async_io ............... [NO] ....... [NO] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -ninja .................. [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- --------------------------------------------------- -JIT compiled ops requires ninja -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... 
['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ...................DeepSpeed general environment info: 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. - ...... torch 1.8, cuda 11.1 -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -ninja .................. [OKAY] --------------------------------------------------- -transformer_inference .. [NO] ....... [OKAY] -op name ................ installed .. compatible --------------------------------------------------- -utils .................. [YES] ...... [OKAY] -cpu_adam ............... [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... 
[OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. --------------------------------------------------- -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -DeepSpeed general environment info: -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -stochastic_transformer . [NO] ....... [OKAY] -torch version .................... 1.8.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -async_io ............... [NO] ....... [NO] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -transformer_inference .. 
[NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install pathtorch install path .............................. ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch version torch version.................... .................... 1.8.1 -1.8.1 -torch cuda version torch cuda version............... ...............11.1 -11.1 -nvcc version nvcc version..................... .....................11.2 -11.2deepspeed install path - ...........deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']deepspeed info - ...................deepspeed info ...................0.4.2+bc17042, bc17042, big-science -0.4.2+bc17042, bc17042, big-sciencedeepspeed wheel compiled w. - deepspeed wheel compiled w....... ......torch 1.8, cuda 11.1 -torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... DeepSpeed general environment info:11.2 -deepspeed install path ........... - ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed infotorch install path ................... 
...............0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 --------------------------------------------------- -DeepSpeed general environment info: -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -ninja .................. 
[OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -utils .................. [YES] ...... [OKAY] -async_io ............... [NO]async_io ....... [NO]............... - [NO] ....... [NO] -quantizer .............. [NO] ....... [OKAY] -transformer_inference .. [NO] ....... transformer_inference[OKAY] -.. [NO] ....... [OKAY]utils --------------------------------------------------- - .................. [YES] ...... [OKAY]utils - .................. [YES]quantizer .................... [OKAY][NO] - ....... quantizer[OKAY] -.............. [NO] --------------------------------------------------....... - [OKAY] --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -transformer_inference .. [NO] ....... [OKAY] -async_io ............... [NO] ....... [NO] -utils .................. [YES] ...... [OKAY] -transformer_inference .. [NO] ....... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`.-------------------------------------------------- - -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -ninja .................. 
[OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... 
[NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_io ............... [NO] .......async_io [NO]............... - [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -transformer_inferencetransformer_inference .... [NO][NO] .............. [OKAY][OKAY] - -utilsutils .................................... [YES][YES] ............ [OKAY][OKAY] - -quantizerquantizer ............................ [NO][NO] .............. [OKAY][OKAY] - -utils .................. [YES] ...... [OKAY] ----------------------------------------------------------------------------------------------------- - -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -async_ioasync_io .............................. [NO][NO] .............. [NO][NO] - -torch version .................... 1.8.1 -transformer_inference ..transformer_inference [NO].. .......[NO] [OKAY]....... - [OKAY] -torch cuda version ............... 11.1 -utils ..................utils [YES].................. ......[YES] [OKAY]...... - [OKAY] -nvcc version ..................... 11.2 -quantizer ..............quantizer [NO].............. ....... [NO][OKAY] -....... [OKAY] -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] --------------------------------------------------- --------------------------------------------------- -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... 
torch 1.8, cuda 11.1 --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... async_io[NO] - ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -transformer_inferenceutils .................... [YES] ...... [OKAY] - [NO] ....... [OKAY]quantizer - .............. [NO] ....... [OKAY] -utils .................. [YES]-------------------------------------------------- -...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -transformer_inference .. [NO] ....... [OKAY] -async_io ............... [NO] ....... [NO] -utils .................. [YES] ...... [OKAY] -transformer_inference .. [NO] ....... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... 
[OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -/bin/sh: line 0: type: git: not found -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... 
[OKAY] -DeepSpeed general environment info: -fused_lamb ............. [NO] ....... [OKAY] -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -sparse_attn ............ [NO] ....... [OKAY] -torch cuda version ............... 11.1 --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -transformer ............ [NO] ....... [OKAY] -nvcc version ..................... 11.2 -JIT compiled ops requires ninja -stochastic_transformer . [NO] ....... [OKAY] -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -DeepSpeed general environment info: -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -ninja .................. [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -DeepSpeed general environment info:torch install path - ............... torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] --------------------------------------------------- -async_io ............... [NO] ....... [NO] -torch version ....................['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -1.8.1 -torch version torch cuda version.................... ...............1.8.1 -11.1 -op name ................ installed .. compatible --------------------------------------------------- -transformer_inference .. [NO] ....... 
[OKAY] -torch cuda versionnvcc version .................................... 11.111.2 - -cpu_adam ............... [YES] ...... [OKAY] -utils .................. [YES] ...... [OKAY] -nvcc versiondeepspeed install path ................................ 11.2 -['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed install path deepspeed info........... ................... 0.4.2+bc17042, bc17042, big-science['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -fused_adam ............. [NO] ....... [OKAY] -quantizer .............. [NO] ....... [OKAY] -deepspeed wheel compiled w.deepspeed info ......................... torch 1.8, cuda 11.10.4.2+bc17042, bc17042, big-science - -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -fused_lamb ............. [NO] ....... [OKAY] --------------------------------------------------- -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -torch cuda version ............... 11.1 -JIT compiled ops requires ninja -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... 
torch 1.8, cuda 11.1 --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -DeepSpeed general environment info:DeepSpeed general environment info: - -ninja .................. [OKAY] -torch install pathtorch install path .............................. ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch versiontorch version ........................................ 1.8.11.8.1 - --------------------------------------------------- -torch cuda versiontorch cuda version .............................. 11.111.1 - -nvcc versionnvcc version .......................................... 11.211.2 - -deepspeed install pathdeepspeed install path ...................... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -op name ................ installed .. compatible --------------------------------------------------- -deepspeed infodeepspeed info ...................................... 0.4.2+bc17042, bc17042, big-science0.4.2+bc17042, bc17042, big-science - -deepspeed wheel compiled w.deepspeed wheel compiled w. ............ torch 1.8, cuda 11.1torch 1.8, cuda 11.1 - -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -fused_lamb ............. [NO] ....... [OKAY] -async_io ............... [NO] ....... [NO] -sparse_attn ............ [NO] ....... [OKAY] -transformer_inference .. [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... 
[OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -async_io ............... [NO] ....... [NO] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -transformer_inference .. [NO] ....... [OKAY] -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -utils .................. [YES] ...... [OKAY] -JIT compiled ops requires ninja -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install pathtorch install path .............................. ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch versiontorch version ........................................ 1.8.11.8.1 - -torch cuda versiontorch cuda version .............................. 11.1 -11.1nvcc version - nvcc version..................... .....................11.2 -11.2deepspeed install path - ...........deepspeed install path ...........['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']deepspeed info - deepspeed info................... 
...................0.4.2+bc17042, bc17042, big-science -0.4.2+bc17042, bc17042, big-sciencedeepspeed wheel compiled w. - ...... torch 1.8, cuda 11.1 -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -utils .................. [YES] ...... [OKAY] -cpu_adam ............... [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -ninja .................. [OKAY] -torch version .................... 1.8.1 --------------------------------------------------- -torch cuda version ............... 11.1 -op name ................ installed .. compatible --------------------------------------------------- -nvcc version ..................... 11.2 -cpu_adam ............... [YES] ...... [OKAY] -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -fused_adam ............. [NO] ....... [OKAY] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -fused_lamb ............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -stochastic_transformer . [NO] ....... [OKAY] -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 --------------------------------------------------- -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -DeepSpeed C++/CUDA extension op report -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. 
Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -stochastic_transformer . [NO] ....... [OKAY] -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ 
[NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -async_io ............... [NO] ....... [NO] -ninja .................. [OKAY] -transformer_inference .. [NO] ....... [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -utils .................. [YES] ...... [OKAY] -cpu_adam ............... [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] -fused_adam ............. [NO] ....... [OKAY] --------------------------------------------------- -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -/bin/sh: line 0: type: git: not found -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report -transformer ............ [NO] ....... [OKAY] --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -stochastic_transformer . [NO] ....... [OKAY] -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -utils .................. [YES] ...... [OKAY] -op name ................ installed .. compatible -DeepSpeed general environment info: - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. --------------------------------------------------- -quantizer .............. [NO] ....... [OKAY] -cpu_adam ............... [YES] ...... [OKAY] --------------------------------------------------- -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -async_io ............... [NO] ....... [NO] -fused_adam ............. [NO] ....... [OKAY] -torch version .................... 1.8.1 -fused_lamb ............. [NO] ....... [OKAY] -torch cuda version ............... 11.1 -transformer_inference .. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -nvcc version ..................... 11.2 -utils .................. [YES] ...... [OKAY] -transformer ............ [NO] ....... [OKAY] -quantizer .............. [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] --------------------------------------------------- -deepspeed info ................... 
-/bin/sh: line 0: type: git: not found
-**** Git info for Megatron: git_hash=unknown git_branch=unknown ****
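The `type: git: not found` error explains the `unknown` fields that follow it: Megatron queries git at startup to record the code revision, and since git is not installed on the compute nodes the lookup falls back to `unknown`. A hedged sketch of that fallback (the helper below is illustrative, not Megatron's actual function):

```python
# Hedged sketch of the git-info fallback seen above. The function and
# commands are illustrative assumptions, not Megatron's exact code.
import subprocess

def git_info():
    def run(cmd):
        try:
            return subprocess.check_output(
                cmd, stderr=subprocess.DEVNULL).decode().strip()
        except (OSError, subprocess.CalledProcessError):
            return "unknown"  # git missing or not a repo
    return (run(["git", "rev-parse", "--short", "HEAD"]),
            run(["git", "rev-parse", "--abbrev-ref", "HEAD"]))

git_hash, git_branch = git_info()
print(f"**** Git info for Megatron: git_hash={git_hash} git_branch={git_branch} ****")
```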
-[... the same op report, environment info, async_io warning, and git-info lines repeat, interleaved, for each of the remaining launcher processes ...]
compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -DeepSpeed general environment info:DeepSpeed general environment info: - -transformer_inference .. [NO] ....... [OKAY] -torch install path torch install path............... ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version torch version.................... ....................1.8.1 -1.8.1 -utils .................. [YES] ...... [OKAY] -torch cuda version torch cuda version............... ...............11.1 11.1 - -nvcc versionnvcc version .......................................... 11.211.2 - -quantizer .............. [NO] ....... [OKAY] -deepspeed install pathdeepspeed install path ...................... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - --------------------------------------------------- -deepspeed infodeepspeed info ...................................... 0.4.2+bc17042, bc17042, big-science0.4.2+bc17042, bc17042, big-science - -deepspeed wheel compiled w.deepspeed wheel compiled w. ............ torch 1.8, cuda 11.1torch 1.8, cuda 11.1 - -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY]ninja - .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -DeepSpeed general environment info: -torch install path ............... DeepSpeed general environment info: -['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch install path torch version............... .................... 
1.8.1 -torch cuda version ...............['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -11.1 -nvcc versiontorch version ......................................... 11.21.8.1 - -deepspeed install path torch cuda version........... ............... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']11.1 - -deepspeed infonvcc version ........................................ 11.20.4.2+bc17042, bc17042, big-science - -deepspeed install path deepspeed wheel compiled w............ ...... torch 1.8, cuda 11.1['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ...............async_io [NO] ...................... [NO][NO] - ....... [NO] -transformer_inference .. transformer_inference[NO] ......... [NO][OKAY] -....... [OKAY] -utils .................. [YES]utils ........................ [OKAY][YES] - ...... [OKAY]quantizer - .............. [NO] .......quantizer [OKAY].............. - [NO] .......-------------------------------------------------- -[OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. 
--------------------------------------------------- -JIT compiled ops requires ninja -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install pathtorch install path .............................. ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch versiontorch version ........................................ 1.8.11.8.1 - -torch cuda versiontorch cuda version .............................. 11.111.1 - -nvcc versionnvcc version .......................................... 11.211.2 - -deepspeed install pathdeepspeed install path ...................... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed infodeepspeed info ...................................... 0.4.2+bc17042, bc17042, big-science0.4.2+bc17042, bc17042, big-science - -deepspeed wheel compiled w.deepspeed wheel compiled w. ............ torch 1.8, cuda 11.1torch 1.8, cuda 11.1 - -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] --------------------------------------------------- -quantizer .............. [NO] ....... [OKAY] -DeepSpeed C++/CUDA extension op report --------------------------------------------------- --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... 
torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -DeepSpeed general environment info: -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... 
[OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -ninja .................. [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -op name ................ installed .. compatible --------------------------------------------------- -async_io ............... [NO] ....... [NO] -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -transformer_inference [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. .. [NO] - ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -async_ioquantizer ............................. [NO][NO] .............. [NO][OKAY] - -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -DeepSpeed general environment info: -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -DeepSpeed general environment info: -transformer_inference .. [NO] ....... [OKAY] -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -utils .................. [YES] ...... [OKAY] -torch version .................... 1.8.1 -quantizer .............. [NO] ....... [OKAY] -DeepSpeed general environment info: --------------------------------------------------- -torch cuda version ............... 
11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -DeepSpeed general environment info: -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** --------------------------------------------------- - -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... 
torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -/bin/sh: line 0: type: git: not found -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -/bin/sh: line 0: type: git: not found -DeepSpeed general environment info: -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -/bin/sh: line 0: type: git: not found -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. 
Can be fixed by: `apt install libaio-dev`. - -async_io async_io............... [NO]............... .......[NO] [NO]....... - [NO] -transformer_inference .. [NO] transformer_inference....... ..[OKAY] -[NO] ....... [OKAY] -utils .................. [YES] utils...... ..................[OKAY] -[YES] ...... [OKAY] -quantizer .............. quantizer[NO] ..................... [NO][OKAY] -....... [OKAY]-------------------------------------------------- - --------------------------------------------------- -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] -/bin/sh: line 0: type: git: not found --------------------------------------------------- -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -DeepSpeed general environment info:deepspeed install path ........... -['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ...................torch install path 0.4.2+bc17042, bc17042, big-science -...............deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... 
torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -/bin/sh: line 0: type: git: not found - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -DeepSpeed general environment info: --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -torch install path ............... DeepSpeed general environment info: -['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch install pathtorch version .................... ...............1.8.1 - torch cuda version ............... 11.1 -nvcc version ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']..................... -11.2 -JIT compiled ops requires ninja -deepspeed install path torch version........... ....................['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -1.8.1deepspeed info - ................... 
0.4.2+bc17042, bc17042, big-sciencetorch cuda version - deepspeed wheel compiled w................ ...... 11.1torch 1.8, cuda 11.1 - -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_io async_io............... [NO]............... .......[NO] [NO]....... - [NO] -transformer_inferencetransformer_inference .... [NO][NO] .............. [OKAY][OKAY] - -utils utils.................. ..................[YES] [YES]...... ......[OKAY] -[OKAY] -quantizer ..............quantizer [NO].............. .......[NO] [OKAY]....... - [OKAY] --------------------------------------------------- --------------------------------------------------- -DeepSpeed general environment info: -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -DeepSpeed general environment info:torch install path -............... torch install path ...............['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']1.8.1 - -torch cuda versiontorch version ................................... 11.11.8.1 - -nvcc version torch cuda version..................... ...............11.2 -11.1deepspeed install path - nvcc version........... ..................... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']11.2 - -deepspeed install pathdeepspeed info .............................. 0.4.2+bc17042, bc17042, big-science['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed wheel compiled w.deepspeed info ......................... torch 1.8, cuda 11.10.4.2+bc17042, bc17042, big-science - -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_io ............... [NO] .......async_io [NO] - ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -transformer_inference .. [NO] .......utils [OKAY].................. - [YES] ...... [OKAY] -utils ..................quantizer [YES].............. ......[NO] [OKAY]....... - [OKAY] -quantizer --------------------------------------------------.............. - [NO] ....... [OKAY] --------------------------------------------------- -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -DeepSpeed general environment info: -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -torch install path ............... 
DeepSpeed general environment info: -sparse_attn ............ [NO] ....... [OKAY] -['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -/bin/sh: line 0: type: git: not found -torch install pathtorch version ................................... 1.8.1 -torch cuda version ...............['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -11.1 -transformer ............ [NO] ....... [OKAY] -/bin/sh: line 0: type: git: not found -torch versionnvcc version ......................................... 1.8.111.2 - -stochastic_transformer . [NO] ....... [OKAY] -deepspeed install path torch cuda version........... ............... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']11.1 - -deepspeed infonvcc version ........................................ 0.4.2+bc17042, bc17042, big-science11.2 - -deepspeed wheel compiled w.deepspeed install path ................. torch 1.8, cuda 11.1 -['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -/bin/sh: line 0: type: git: not found -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -/bin/sh: line 0: type: git: not found -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -nvcc version ..................... 11.2 -deepspeed install path ........... 
['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_io async_io............... [NO] ...................... [NO][NO] -....... [NO] -/bin/sh: line 0: type: git: not found -transformer_inference .. [NO] transformer_inference....... ..[OKAY] -[NO] ....... [OKAY] -utils .................. [YES] ......utils [OKAY].................. - [YES] ......quantizer [OKAY].............. - [NO] ....... [OKAY]quantizer - .............. [NO] --------------------------------------------------....... 
- [OKAY] --------------------------------------------------- -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. -/bin/sh: line 0: type: git: not found --------------------------------------------------- -JIT compiled ops requires ninja -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -ninja .................. [OKAY] --------------------------------------------------- -async_io ............... [NO] ....... [NO] -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -transformer_inference .. [NO] ....... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -quantizer .............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] --------------------------------------------------- -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -DeepSpeed general environment info: -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -utils .................. [YES] ...... [OKAY] -async_io ............... [NO] ....... [NO] -quantizer .............. [NO] ....... [OKAY] -/bin/sh: line 0: type: git: not found --------------------------------------------------- -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. --------------------------------------------------- -async_ioasync_io .............................. [NO][NO] .............. 
[NO][NO] - -transformer_inference ..transformer_inference [NO].. .......[NO] [OKAY]....... [OKAY] - -utils .................. [YES] utils...... [OKAY] - .................. [YES]quantizer .................... [OKAY][NO] ....... -[OKAY] -quantizer-------------------------------------------------- -.............. [NO] ....... [OKAY] --------------------------------------------------- -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -DeepSpeed general environment info: -DeepSpeed general environment info: -DeepSpeed general environment info: -torch install path ............... torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']torch version - .................... 1.8.1torch version - .................... torch cuda version1.8.1 -............... 11.1torch cuda version - nvcc version............... .....................11.1 -DeepSpeed general environment info: -11.2nvcc version - deepspeed install path..................... ...........11.2 -deepspeed install path['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -torch install path ............... torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -........... deepspeed info ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']................... 
- 0.4.2+bc17042, bc17042, big-sciencedeepspeed info -['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']torch version - .................... torch version1.8.1 -.................... torch cuda version1.8.1 - ...................deepspeed wheel compiled w. 0.4.2+bc17042, bc17042, big-science...... - torch 1.8, cuda 11.1deepspeed wheel compiled w. - ...... torch 1.8, cuda 11.1 -............... torch cuda version11.1 -...............nvcc version 11.1..................... - nvcc version11.2 -.....................deepspeed install path 11.2........... - deepspeed install path ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']........... - deepspeed info ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']................... - deepspeed info0.4.2+bc17042, bc17042, big-science -...................deepspeed wheel compiled w. 0.4.2+bc17042, bc17042, big-science...... - deepspeed wheel compiled w.torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -...... torch 1.8, cuda 11.1 -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -DeepSpeed general environment info: -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -async_io ............... [NO] ....... [NO] -DeepSpeed general environment info: -transformer_inference .. [NO] ....... [OKAY] -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -utils .................. [YES] ...... [OKAY] -torch cuda version ............... 11.1 -quantizer .............. [NO] ....... [OKAY] -nvcc version ..................... 11.2 --------------------------------------------------- -deepspeed install path ........... 
['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -sparse_attn ............ [NO] ....... [OKAY] -JIT compiled ops requires ninja -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. 
- [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`.
-async_io ............... [NO] ....... [NO]
-transformer_inference .. [NO] ....... [OKAY]
-utils .................. [YES] ...... [OKAY]
-quantizer .............. [NO] ....... [OKAY]
---------------------------------------------------
-DeepSpeed general environment info:
-torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']
-torch version .................... 1.8.1
-torch cuda version ............... 11.1
-nvcc version ..................... 11.2
-deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']
-deepspeed info ................... 0.4.2+bc17042, bc17042, big-science
-deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1
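The environment block prints identically on every rank. For reference, the same values can be read back directly in Python (a small sketch using public attributes only; it assumes nvcc is on PATH, which is why the log can show nvcc 11.2 next to torch's bundled CUDA 11.1):

    import subprocess
    import torch
    import deepspeed

    print("torch install path :", torch.__path__)      # list form, as in the log
    print("torch version      :", torch.__version__)   # 1.8.1 here
    print("torch cuda version :", torch.version.cuda)  # 11.1 here
    print("deepspeed info     :", deepspeed.__version__)
    # nvcc belongs to the system CUDA toolkit and may differ from torch's CUDA.
    print(subprocess.run(["nvcc", "--version"], capture_output=True, text=True).stdout)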
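The repeated "/bin/sh: line 0: type: git: not found" lines mean the compute nodes have no git binary on PATH, so Megatron's startup falls back to git_hash=unknown. A hypothetical helper showing that fallback pattern (not Megatron's actual code):

    import subprocess

    def git_info_or_unknown():
        # If git is absent or the call fails, report "unknown" instead of crashing.
        try:
            rev = subprocess.run(["git", "rev-parse", "--short", "HEAD"],
                                 capture_output=True, text=True, check=True)
            branch = subprocess.run(["git", "rev-parse", "--abbrev-ref", "HEAD"],
                                    capture_output=True, text=True, check=True)
            return rev.stdout.strip(), branch.stdout.strip()
        except (FileNotFoundError, subprocess.CalledProcessError):
            return "unknown", "unknown"

    git_hash, git_branch = git_info_or_unknown()
    print(f"**** Git info for Megatron: git_hash={git_hash} git_branch={git_branch} ****")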
- torch 1.8, cuda 11.1 --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... 
[OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -ninja .................. [OKAY] --------------------------------------------------- -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... 
['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_ioasync_io .............................. [NO][NO] .............. [NO][NO] - -transformer_inferencetransformer_inference .... [NO][NO] .............. [OKAY][OKAY] - -utilsutils .................................... [YES][YES] ............ [OKAY][OKAY] - -quantizer quantizer.............. ..............[NO] [NO]....... .......[OKAY] -[OKAY] --------------------------------------------------- --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install pathtorch install path .............................. ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -torch versiontorch version ........................................ 1.8.11.8.1 - -torch cuda versiontorch cuda version .............................. 11.111.1 - -async_io ............... [NO] ....... [NO] -nvcc versionnvcc version .......................................... 11.211.2 - -transformer_inference .. [NO] ....... [OKAY] -deepspeed install pathdeepspeed install path ...................... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed infodeepspeed info ...................................... 0.4.2+bc17042, bc17042, big-science -utils .................. [YES] ...... [OKAY] -0.4.2+bc17042, bc17042, big-sciencedeepspeed wheel compiled w. - deepspeed wheel compiled w....... ......torch 1.8, cuda 11.1 -torch 1.8, cuda 11.1 -quantizer .............. [NO] ....... 
[OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] -async_io-------------------------------------------------- -............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_ioasync_io .............................. [NO][NO] .............. [NO][NO] - -transformer_inferencetransformer_inference .... [NO][NO] .............. [OKAY][OKAY] - -utilsutils .................................... [YES][YES] ............ [OKAY][OKAY] - -quantizerquantizer ............................ [NO][NO] .............. [OKAY][OKAY] - ----------------------------------------------------------------------------------------------------- - -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -/bin/sh: line 0: type: git: not found -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 
11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -/bin/sh: line 0: type: git: not found -async_ioasync_io .............................. [NO][NO] .............. [NO][NO] - -transformer_inference transformer_inference.. ..[NO] [NO]....... .......[OKAY] -[OKAY] -utils utils.................. [YES].................. ......[YES] [OKAY]...... - [OKAY] -quantizer .............. [NO] quantizer....... ..............[OKAY] -[NO] .......-------------------------------------------------- -[OKAY] --------------------------------------------------- -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... 
torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -/bin/sh: line 0: type: git: not found -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. 
[NO] ....... [OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. 
--------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO]ninja ....... ..................[OKAY] -[OKAY] -transformer ............ [NO] --------------------------------------------------....... - [OKAY]op name - ................ installedstochastic_transformer .. .compatible -[NO]-------------------------------------------------- -....... [OKAY] -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -DeepSpeed general environment info: -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... 
torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... DeepSpeed general environment info: -['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch install pathtorch version ................................... 1.8.1 -torch cuda version ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']11.1 - -nvcc versiontorch version ......................................... 11.21.8.1 - -deepspeed install path torch cuda version........... ............... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']11.1 - -deepspeed infonvcc version ........................................ 0.4.2+bc17042, bc17042, big-science11.2 - -deepspeed wheel compiled w.deepspeed install path ................. torch 1.8, cuda 11.1 -['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... 
[OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -DeepSpeed general environment info:torch install path - ............... torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'].................... - 1.8.1 -torch version ....................torch cuda version 1.8.1............... - 11.1 -torch cuda versionnvcc version .................................... 11.111.2 - -nvcc versiondeepspeed install path ................................ 11.2 -['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']deepspeed install path - ...........deepspeed info ................... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']0.4.2+bc17042, bc17042, big-science - -deepspeed infodeepspeed wheel compiled w. ......................... 0.4.2+bc17042, bc17042, big-sciencetorch 1.8, cuda 11.1 - -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_ioasync_io .............................. [NO][NO] .............. [NO][NO] - -transformer_inference ..transformer_inference [NO].. .......[NO] [OKAY]....... - [OKAY] -utils ..................utils [YES].................. ......[YES] [OKAY]...... - [OKAY] -quantizer ..............quantizer [NO].............. .......[NO] [OKAY]....... 
- [OKAY] --------------------------------------------------- --------------------------------------------------- -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install pathtorch install path ............... ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version torch version.................... ....................1.8.1 -1.8.1 -torch cuda version torch cuda version............... ...............11.1 -11.1nvcc version - nvcc version..................... .....................11.2 -11.2deepspeed install path - deepspeed install path........... ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed infodeepspeed info ...................................... 0.4.2+bc17042, bc17042, big-science0.4.2+bc17042, bc17042, big-science - -deepspeed wheel compiled w.deepspeed wheel compiled w. ............ torch 1.8, cuda 11.1torch 1.8, cuda 11.1 - - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -transformer_inference .. [NO] ....... [OKAY] -async_ioutils ................................. [NO][YES] ............. [NO][OKAY] - -quantizer .............. [NO] ....... [OKAY] -transformer_inference-------------------------------------------------- -.. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install pathtorch install path .............................. 
['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch versiontorch version ........................................ 1.8.11.8.1 - -torch cuda versiontorch cuda version .............................. 11.111.1 - -nvcc versionnvcc version .......................................... 11.211.2 - -deepspeed install pathdeepspeed install path ...................... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed infodeepspeed info ...................................... 0.4.2+bc17042, bc17042, big-science -0.4.2+bc17042, bc17042, big-sciencedeepspeed wheel compiled w. - deepspeed wheel compiled w....... torch 1.8, cuda 11.1...... - torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... 
['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -DeepSpeed general environment info:torch install path -............... torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version ....................['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] 1.8.1 - -torch versiontorch cuda version ................................... 1.8.111.1 - -nvcc versiontorch cuda version .................................... 11.211.1 - -deepspeed install pathnvcc version ................................ 11.2['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed install path deepspeed info........... ................... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']0.4.2+bc17042, bc17042, big-science - -deepspeed infodeepspeed wheel compiled w. ......................... torch 1.8, cuda 11.1 -0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 
0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -transformer_inference .. [NO]utils ......................... [OKAY][YES] - ...... [OKAY] -utils quantizer.................. ..............[YES] [NO]...... ....... [OKAY] -[OKAY] --------------------------------------------------- -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 
1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... 
torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w.DeepSpeed general environment info: ...... torch 1.8, cuda 11.1 - -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install pathtorch install path .............................. ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch versiontorch version ........................................ 1.8.11.8.1 - -torch cuda versiontorch cuda version .............................. 11.111.1 - -nvcc versionnvcc version .......................................... 11.211.2 - -deepspeed install pathdeepspeed install path ...................... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed infodeepspeed info ...................................... 0.4.2+bc17042, bc17042, big-science0.4.2+bc17042, bc17042, big-science - -deepspeed wheel compiled w.deepspeed wheel compiled w. ............ torch 1.8, cuda 11.1torch 1.8, cuda 11.1 - -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 
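(Editor's note: the environment and op-compatibility blocks above were emitted once per rank and have been deduplicated to a single copy. They are the standard output of DeepSpeed's `ds_report` utility. As a minimal sketch, assuming only that the `ds_report` console script installed with DeepSpeed is on PATH, the same report can be captured programmatically:

    # Minimal sketch: capture the same environment/op report shown above.
    # Assumes the `ds_report` script shipped with DeepSpeed is on PATH;
    # everything else is standard library.
    import subprocess

    report = subprocess.run(
        ["ds_report"], capture_output=True, text=True, check=True
    ).stdout

    # The async_io row reads [NO] above because the libaio headers are
    # missing; the log itself names the fix: `apt install libaio-dev`.
    print(report)

Ops marked installed=[NO] but compatible=[OKAY] are simply built just-in-time by ninja the first time they are used, as the report's NOTE states.)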
-using world size: 256, data-parallel-size: 8, tensor-model-parallel size: 4, pipeline-model-parallel size: 8
-using torch.float16 for parameters ...
-------------------------- arguments ------------------------
-  accumulate_allreduce_grads_in_fp32 .............. False
-  adam_beta1 ...................................... 0.9
-  adam_beta2 ...................................... 0.999
-  adam_eps ........................................ 1e-08
-  adlr_autoresume ................................. False
-  adlr_autoresume_interval ........................ 1000
-  apply_query_key_layer_scaling ................... True
-  apply_residual_connection_post_layernorm ........ False
-  attention_dropout ............................... 0.1
-  attention_softmax_in_fp32 ....................... False
-  bert_binary_head ................................ True
-  bert_load ....................................... None
-  bf16 ............................................ False
-  bias_dropout_fusion ............................. True
-  bias_gelu_fusion ................................ True
-  biencoder_projection_dim ........................ 0
-  biencoder_shared_query_context_model ............ False
-  block_data_path ................................. None
-  checkpoint_activations .......................... True
-  checkpoint_in_cpu ............................... False
-  checkpoint_num_layers ........................... 1
-  clip_grad ....................................... 1.0
-  codecarbon_dir .................................. /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/tr8-104B-logs/codecarbon
-  consumed_train_samples .......................... 0
-  consumed_valid_samples .......................... 0
-  contigious_checkpointing ........................ False
-  cpu_optimizer ................................... False
-  cpu_torch_adam .................................. False
-  data_impl ....................................... mmap
-  data_parallel_size .............................. 8
-  data_path ....................................... ['/gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document']
-  dataloader_type ................................. single
-  DDP_impl ........................................ local
-  decoder_seq_length .............................. None
-  deepscale ....................................... False
-  deepscale_config ................................ None
-  deepspeed ....................................... True
-  deepspeed_activation_checkpointing .............. True
-  deepspeed_config ................................ ./ds_config.1186600.json
-  deepspeed_mpi ................................... False
-  distribute_checkpointed_activations ............. False
-  distributed_backend ............................. nccl
-  embedding_path .................................. None
-  encoder_seq_length .............................. 2048
-  eod_mask_loss ................................... False
-  eval_interval ................................... 1000
-  eval_iters ...................................... 5
-  evidence_data_path .............................. None
-  exit_duration_in_mins ........................... 1190
-  exit_interval ................................... None
-  ffn_hidden_size ................................. 20480
-  finetune ........................................ False
-  fp16 ............................................ True
-  fp16_lm_cross_entropy ........................... False
-  fp32_residual_connection ........................ False
-  global_batch_size ............................... 2048
-  hidden_dropout .................................. 0.1
-  hidden_size ..................................... 16384
-  hysteresis ...................................... 2
-  ict_head_size ................................... None
-  ict_load ........................................ None
-  img_dim ......................................... 224
-  indexer_batch_size .............................. 128
-  indexer_log_interval ............................ 1000
-  init_method_std ................................. 0.02
-  init_method_xavier_uniform ...................... False
-  initial_loss_scale .............................. 4294967296
-  kv_channels ..................................... 512
-  layernorm_epsilon ............................... 1e-05
-  lazy_mpu_init ................................... None
-  load ............................................ /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints
-  local_rank ...................................... 0
-  log_batch_size_to_tensorboard ................... True
-  log_interval .................................... 10
-  log_learning_rate_to_tensorboard ................ True
-  log_loss_scale_to_tensorboard ................... True
-  log_num_zeros_in_grad ........................... False
-  log_params_norm ................................. False
-  log_timers_to_tensorboard ....................... True
-  log_validation_ppl_to_tensorboard ............... True
-  loss_scale ...................................... 12.0
-  loss_scale_window ............................... 1000
-  lr .............................................. 6e-05
-  lr_decay_iters .................................. None
-  lr_decay_samples ................................ 126953125
-  lr_decay_style .................................. cosine
-  lr_warmup_fraction .............................. None
-  lr_warmup_iters ................................. 0
-  lr_warmup_samples ............................... 216320
-  make_vocab_size_divisible_by .................... 128
-  mask_prob ....................................... 0.15
-  masked_softmax_fusion ........................... True
-  max_position_embeddings ......................... 2048
-  memory_centric_tiled_linear ..................... False
-  merge_file ...................................... /gpfswork/rech/six/commun/code/tr8-104B/Megatron-DeepSpeed-tr8-104B/data/gpt2-merges.txt
-  micro_batch_size ................................ 1
-  min_loss_scale .................................. 1.0
-  min_lr .......................................... 6e-06
-  mmap_warmup ..................................... False
-  no_load_optim ................................... None
-  no_load_rng ..................................... None
-  no_save_optim ................................... None
-  no_save_rng ..................................... None
-  num_attention_heads ............................. 32
-  num_channels .................................... 3
-  num_classes ..................................... 1000
-  num_layers ...................................... 32
-  num_layers_per_virtual_pipeline_stage ........... None
-  num_workers ..................................... 2
-  onnx_safe ....................................... None
-  openai_gelu ..................................... False
-  optimizer ....................................... adam
-  override_lr_scheduler ........................... False
-  params_dtype .................................... torch.float16
-  partition_activations ........................... False
-  patch_dim ....................................... 16
-  pipeline_model_parallel_size .................... 8
-  position_embedding_type ......................... PositionEmbeddingType.absolute
-  profile_backward ................................ False
-  query_in_block_prob ............................. 0.1
-  rampup_batch_size ............................... ['16', '16', '6_000_000']
-  rank ............................................ 0
-  remote_device ................................... none
-  reset_attention_mask ............................ False
-  reset_position_ids .............................. False
-  retriever_report_topk_accuracies ................ []
-  retriever_score_scaling ......................... False
-  retriever_seq_length ............................ 256
-  sample_rate ..................................... 1.0
-  save ............................................ /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints
-  save_interval ................................... 1500
-  scatter_gather_tensors_in_pipeline .............. True
-  scattered_embeddings ............................ False
-  seed ............................................ 42
-  seq_length ...................................... 2048
-  sgd_momentum .................................... 0.9
-  short_seq_prob .................................. 0.1
-  split ........................................... 949,50,1
-  split_transformers .............................. False
-  synchronize_each_layer .......................... False
-  tensor_model_parallel_size ...................... 4
-  tensorboard_dir ................................. /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/tr8-104B-logs/tensorboard
-  tensorboard_log_interval ........................ 1
-  tensorboard_queue_size .......................... 5
-  tile_factor ..................................... 1
-  titles_data_path ................................ None
-  tokenizer_name_or_path .......................... None
-  tokenizer_type .................................. GPT2BPETokenizer
-  train_iters ..................................... None
-  train_samples ................................... 300000000
-  use_checkpoint_lr_scheduler ..................... False
-  use_contiguous_buffers_in_ddp ................... False
-  use_cpu_initialization .......................... None
-  use_one_sent_docs ............................... False
-  use_pin_memory .................................. False
-  virtual_pipeline_model_parallel_size ............ None
-  vocab_extra_ids ................................. 0
-  vocab_file ...................................... /gpfswork/rech/six/commun/code/tr8-104B/Megatron-DeepSpeed-tr8-104B/data/gpt2-vocab.json
-  weight_decay .................................... 0.1
-  world_size ...................................... 256
-  zero_allgather_bucket_size ...................... 0.0
-  zero_contigious_gradients ....................... False
-  zero_reduce_bucket_size ......................... 0.0
-  zero_reduce_scatter ............................. False
-  zero_stage ...................................... 1
--------------------- end of arguments ---------------------
-will use batch size rampup starting from global batch size 16 to global batch size 2048 with batch size increments 16 over 6000000 samples.
-> building GPT2BPETokenizer tokenizer ...
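The rampup line above implies a fixed piecewise-constant schedule: (2048 - 16) / 16 = 127 intermediate batch sizes, each held for roughly 6,000,000 / 127 ≈ 47,244 samples. A sketch of that arithmetic under those assumptions (the helper below is illustrative, not Megatron-DeepSpeed's actual scheduler):

    # Sketch of the batch-size rampup schedule implied by the log line above.
    # start/end/increment/rampup_samples come from the log; the helper name
    # and structure are illustrative, not Megatron-DeepSpeed's actual code.
    def rampup_schedule(start=16, end=2048, increment=16, rampup_samples=6_000_000):
        steps = (end - start) // increment          # 127 intermediate sizes
        samples_per_step = rampup_samples // steps  # ~47,244 samples per size
        return [(start + i * increment, samples_per_step) for i in range(steps)]

    print(rampup_schedule()[:3])   # [(16, 47244), (32, 47244), (48, 47244)]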
---------------------------------------------------
-DeepSpeed C++/CUDA extension op report
---------------------------------------------------
-NOTE: Ops not installed will be just-in-time (JIT) compiled at
-      runtime if needed. Op compatibility means that your system
-      meet the required dependencies to JIT install the op.
---------------------------------------------------
-JIT compiled ops requires ninja
-ninja .................. [OKAY]
---------------------------------------------------
-op name ................ installed .. compatible
---------------------------------------------------
-cpu_adam ............... [YES] ...... [OKAY]
-fused_adam ............. [NO] ....... [OKAY]
-fused_lamb ............. [NO] ....... [OKAY]
-sparse_attn ............ [NO] ....... [OKAY]
-transformer ............ [NO] ....... [OKAY]
-stochastic_transformer . [NO] ....... [OKAY]
---------------------------------------------------
- [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`.
-async_io ............... [NO] ....... [NO]
-transformer_inference .. [NO] ....... [OKAY]
-utils .................. [YES] ...... [OKAY]
-quantizer .............. [NO] ....... [OKAY]
---------------------------------------------------
- > padded vocab (size: 50257) with 431 dummy tokens (new size: 50688)
-> setting tensorboard ...
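The padded-vocab line above is pure arithmetic from the arguments: the vocabulary is padded up to a multiple of make_vocab_size_divisible_by (128) times the tensor-model-parallel size (4), i.e. a multiple of 512, so 50257 rounds up to 50688 and 431 dummy tokens are added:

    # Vocab padding arithmetic implied by the log line above.
    orig_vocab_size = 50257
    multiple = 128 * 4                 # make_vocab_size_divisible_by * tp size
    padded = ((orig_vocab_size + multiple - 1) // multiple) * multiple
    print(padded, padded - orig_vocab_size)   # 50688 431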
-> setting codecarbon ...
-> initializing torch distributed ...
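The torch distributed initialization above is the standard NCCL process-group setup performed by each of the 256 ranks; a minimal sketch of the generic pattern (environment-variable rendezvous as a SLURM/deepspeed launcher would provide; this is not the exact Megatron-DeepSpeed code path):

    # Generic NCCL process-group init; RANK, WORLD_SIZE, MASTER_ADDR and
    # MASTER_PORT are assumed to be set by the launcher, as in this run.
    import torch.distributed as dist

    dist.init_process_group(backend="nccl")   # env:// rendezvous by default
    rank = dist.get_rank()
    world_size = dist.get_world_size()        # 256 in the log above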
-> initializing tensor model parallel with size 4
-> initializing pipeline model parallel with size 8
-> setting random seeds to 42 ...
-[2021-09-25 04:27:14,118] [INFO] [checkpointing.py:226:model_parallel_cuda_manual_seed] > initializing model parallel cuda seeds on global rank 0, model parallel rank 0, and data parallel rank 0 with model parallel seed: 2760 and data parallel seed: 42
-> compiling dataset index builder ...
-make: Entering directory '/gpfsssd/worksf/projects/rech/six/commun/code/tr8-104B/Megatron-DeepSpeed-tr8-104B/megatron/data'
-make: Nothing to be done for 'default'.
-make: Leaving directory '/gpfsssd/worksf/projects/rech/six/commun/code/tr8-104B/Megatron-DeepSpeed-tr8-104B/megatron/data'
->>> done with dataset index builder. Compilation time: 0.302 seconds
-> compiling and loading fused kernels ...
-/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch/utils/cpp_extension.py:283: UserWarning:
-
-                               !! WARNING !!
-
-!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
-Your compiler (c++) is not compatible with the compiler Pytorch was
-built with for this platform, which is g++ on linux. Please
-use g++ to to compile your extension. Alternatively, you may
-compile PyTorch from source using c++, and then you can also use
-c++ to compile your extension.
-
-See https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md for help
-with compiling PyTorch from source.
-!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
-
-                               !! WARNING !!
-
-  warnings.warn(WRONG_COMPILER_WARNING.format(
-Detected CUDA files, patching ldflags
-Emitting ninja build file /gpfsssd/worksf/projects/rech/six/commun/code/tr8-104B/Megatron-DeepSpeed-tr8-104B/megatron/fused_kernels/build/build.ninja...
-Building extension module scaled_upper_triang_masked_softmax_cuda...
-Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
-ninja: no work to do.
-Loading extension module scaled_upper_triang_masked_softmax_cuda...
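The "Emitting ninja build file ... / ninja: no work to do. / Loading extension module ..." sequence for each fused kernel is PyTorch's JIT C++/CUDA extension loader finding an up-to-date cached build and reusing it. A generic sketch of the same mechanism (file names below are placeholders, not Megatron's actual source list):

    # Generic torch JIT-extension load; this is the mechanism behind the
    # Emitting/Building/Loading lines above. Name and sources are placeholders.
    from torch.utils.cpp_extension import load

    mod = load(
        name="my_fused_kernel",
        sources=["my_fused_kernel.cpp", "my_fused_kernel_cuda.cu"],
        extra_cuda_cflags=["-O3"],
        verbose=True,   # prints the ninja build messages seen in the log
    )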
-Detected CUDA files, patching ldflags
-Emitting ninja build file /gpfsssd/worksf/projects/rech/six/commun/code/tr8-104B/Megatron-DeepSpeed-tr8-104B/megatron/fused_kernels/build/build.ninja...
-Building extension module scaled_masked_softmax_cuda...
-Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
-ninja: no work to do.
-Loading extension module scaled_masked_softmax_cuda...
-Detected CUDA files, patching ldflags
-Emitting ninja build file /gpfsssd/worksf/projects/rech/six/commun/code/tr8-104B/Megatron-DeepSpeed-tr8-104B/megatron/fused_kernels/build/build.ninja...
-Building extension module fused_mix_prec_layer_norm_cuda...
-Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
-ninja: no work to do.
-Loading extension module fused_mix_prec_layer_norm_cuda...
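Looking back at the parallel-state initialization above, the numbers are internally consistent: 4 (tensor) x 8 (pipeline) x 8 (data) = 256 ranks, and the model-parallel seed 2760 is the base seed 42 plus a fixed offset of 2718. That offset matches Megatron's model_parallel_cuda_manual_seed convention, but treat it as an assumption about this exact code version:

    # Topology and RNG-seed arithmetic consistent with the log lines above.
    world_size = 256
    tp, pp = 4, 8                        # tensor-/pipeline-model-parallel sizes
    dp = world_size // (tp * pp)         # -> 8, the data-parallel size
    assert tp * pp * dp == world_size

    seed, tp_rank = 42, 0                # global rank 0 in the log
    model_parallel_seed = seed + 2718 + tp_rank   # -> 2760, as logged
    data_parallel_seed = seed                     # -> 42, as logged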
-Your compiler (c++) is not compatible with the compiler Pytorch was -built with for this platform, which is g++ on linux. Please -use g++ to to compile your extension. Alternatively, you may -compile PyTorch from source using c++, and then you can also use -c++ to compile your extension. - -See https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md for help -with compiling PyTorch from source. -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! - - !! WARNING !! - - warnings.warn(WRONG_COMPILER_WARNING.format( -/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch/utils/cpp_extension.py:283: UserWarning: - - !! WARNING !! - -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! -Your compiler (c++) is not compatible with the compiler Pytorch was -built with for this platform, which is g++ on linux. Please -use g++ to to compile your extension. Alternatively, you may -compile PyTorch from source using c++, and then you can also use -c++ to compile your extension. - -See https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md for help -with compiling PyTorch from source. -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! - - !! WARNING !! - - warnings.warn(WRONG_COMPILER_WARNING.format( -/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch/utils/cpp_extension.py:283: UserWarning: - - !! WARNING !! - -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! -Your compiler (c++) is not compatible with the compiler Pytorch was -built with for this platform, which is g++ on linux. Please -use g++ to to compile your extension. Alternatively, you may -compile PyTorch from source using c++, and then you can also use -c++ to compile your extension. - -See https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md for help -with compiling PyTorch from source. -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! - - !! WARNING !! - - warnings.warn(WRONG_COMPILER_WARNING.format( -/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch/utils/cpp_extension.py:283: UserWarning: - - !! WARNING !! - -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! -Your compiler (c++) is not compatible with the compiler Pytorch was -built with for this platform, which is g++ on linux. Please -use g++ to to compile your extension. Alternatively, you may -compile PyTorch from source using c++, and then you can also use -c++ to compile your extension. - -See https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md for help -with compiling PyTorch from source. -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! - - !! WARNING !! - - warnings.warn(WRONG_COMPILER_WARNING.format( -/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch/utils/cpp_extension.py:283: UserWarning: - - !! WARNING !! - -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! -Your compiler (c++) is not compatible with the compiler Pytorch was -built with for this platform, which is g++ on linux. Please -use g++ to to compile your extension. Alternatively, you may -compile PyTorch from source using c++, and then you can also use -c++ to compile your extension. - -See https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md for help -with compiling PyTorch from source. 
-!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! - - !! WARNING !! - - warnings.warn(WRONG_COMPILER_WARNING.format( -/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch/utils/cpp_extension.py:283: UserWarning: - - !! WARNING !! - -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! -Your compiler (c++) is not compatible with the compiler Pytorch was -built with for this platform, which is g++ on linux. Please -use g++ to to compile your extension. Alternatively, you may -compile PyTorch from source using c++, and then you can also use -c++ to compile your extension. - -See https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md for help -with compiling PyTorch from source. -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! - - !! WARNING !! - - warnings.warn(WRONG_COMPILER_WARNING.format( -/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch/utils/cpp_extension.py:283: UserWarning: - - !! WARNING !! - -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! -Your compiler (c++) is not compatible with the compiler Pytorch was -built with for this platform, which is g++ on linux. Please -use g++ to to compile your extension. Alternatively, you may -compile PyTorch from source using c++, and then you can also use -c++ to compile your extension. - -See https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md for help -with compiling PyTorch from source. -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! - - !! WARNING !! - - warnings.warn(WRONG_COMPILER_WARNING.format( -/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch/utils/cpp_extension.py:283: UserWarning: - - !! WARNING !! - -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! -Your compiler (c++) is not compatible with the compiler Pytorch was -built with for this platform, which is g++ on linux. Please -use g++ to to compile your extension. Alternatively, you may -compile PyTorch from source using c++, and then you can also use -c++ to compile your extension. - -See https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md for help -with compiling PyTorch from source. -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! - - !! WARNING !! - - warnings.warn(WRONG_COMPILER_WARNING.format( -/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch/utils/cpp_extension.py:283: UserWarning: - - !! WARNING !! - -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! -Your compiler (c++) is not compatible with the compiler Pytorch was -built with for this platform, which is g++ on linux. Please -use g++ to to compile your extension. Alternatively, you may -compile PyTorch from source using c++, and then you can also use -c++ to compile your extension. - -See https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md for help -with compiling PyTorch from source. -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! - - !! WARNING !! - - warnings.warn(WRONG_COMPILER_WARNING.format( -/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch/utils/cpp_extension.py:283: UserWarning: - - !! WARNING !! - -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! 
->>> done with compiling and loading fused kernels. Compilation time: 23.417 seconds
-time to initialize megatron (seconds): 25.843
-[after megatron is initialized] datetime: 2021-09-25 04:27:37
-building GPT model ...
-[2021-09-25 04:27:37,906] [INFO] [utils.py:680:see_memory_usage] Before Building Model -/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch/cuda/memory.py:373: FutureWarning: torch.cuda.memory_cached has been renamed to torch.cuda.memory_reserved - warnings.warn( -/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch/cuda/memory.py:381: FutureWarning: torch.cuda.max_memory_cached has been renamed to torch.cuda.max_memory_reserved - warnings.warn( -[2021-09-25 04:27:37,908] [INFO] [utils.py:681:see_memory_usage] MA 0.0 GB Max_MA 0.0 GB CA 0.0 GB Max_CA 0 GB -[2021-09-25 04:27:37,908] [INFO] [utils.py:689:see_memory_usage] CPU Virtual Memory: used = 36.69 GB, percent = 19.6% -SEED_LAYERS=False BASE_SEED=1234 SEED_FN=None -Using topology: {ProcessCoord(pipe=0, data=0, model=0): 0, ProcessCoord(pipe=0, data=0, model=1): 1, ProcessCoord(pipe=0, data=0, model=2): 2, ProcessCoord(pipe=0, data=0, model=3): 3, ProcessCoord(pipe=0, data=1, model=0): 4, ProcessCoord(pipe=0, data=1, model=1): 5, ProcessCoord(pipe=0, data=1, model=2): 6, ProcessCoord(pipe=0, data=1, model=3): 7, ProcessCoord(pipe=0, data=2, model=0): 8, ProcessCoord(pipe=0, data=2, model=1): 9, ProcessCoord(pipe=0, data=2, model=2): 10, ProcessCoord(pipe=0, data=2, model=3): 11, ProcessCoord(pipe=0, data=3, model=0): 12, ProcessCoord(pipe=0, data=3, model=1): 13, ProcessCoord(pipe=0, data=3, model=2): 14, ProcessCoord(pipe=0, data=3, model=3): 15, ProcessCoord(pipe=0, data=4, model=0): 16, ProcessCoord(pipe=0, data=4, model=1): 17, ProcessCoord(pipe=0, data=4, model=2): 18, ProcessCoord(pipe=0, data=4, model=3): 19, ProcessCoord(pipe=0, data=5, model=0): 20, ProcessCoord(pipe=0, data=5, model=1): 21, ProcessCoord(pipe=0, data=5, model=2): 22, ProcessCoord(pipe=0, data=5, model=3): 23, ProcessCoord(pipe=0, data=6, model=0): 24, ProcessCoord(pipe=0, data=6, model=1): 25, ProcessCoord(pipe=0, data=6, model=2): 26, ProcessCoord(pipe=0, data=6, model=3): 27, ProcessCoord(pipe=0, data=7, model=0): 28, ProcessCoord(pipe=0, data=7, model=1): 29, ProcessCoord(pipe=0, data=7, model=2): 30, ProcessCoord(pipe=0, data=7, model=3): 31, ProcessCoord(pipe=1, data=0, model=0): 32, ProcessCoord(pipe=1, data=0, model=1): 33, ProcessCoord(pipe=1, data=0, model=2): 34, ProcessCoord(pipe=1, data=0, model=3): 35, ProcessCoord(pipe=1, data=1, model=0): 36, ProcessCoord(pipe=1, data=1, model=1): 37, ProcessCoord(pipe=1, data=1, model=2): 38, ProcessCoord(pipe=1, data=1, model=3): 39, ProcessCoord(pipe=1, data=2, model=0): 40, ProcessCoord(pipe=1, data=2, model=1): 41, ProcessCoord(pipe=1, data=2, model=2): 42, ProcessCoord(pipe=1, data=2, model=3): 43, ProcessCoord(pipe=1, data=3, model=0): 44, ProcessCoord(pipe=1, data=3, model=1): 45, ProcessCoord(pipe=1, data=3, model=2): 46, ProcessCoord(pipe=1, data=3, model=3): 47, ProcessCoord(pipe=1, data=4, model=0): 48, ProcessCoord(pipe=1, data=4, model=1): 49, ProcessCoord(pipe=1, data=4, model=2): 50, ProcessCoord(pipe=1, data=4, model=3): 51, ProcessCoord(pipe=1, data=5, model=0): 52, ProcessCoord(pipe=1, data=5, model=1): 53, ProcessCoord(pipe=1, data=5, model=2): 54, ProcessCoord(pipe=1, data=5, model=3): 55, ProcessCoord(pipe=1, data=6, model=0): 56, ProcessCoord(pipe=1, data=6, model=1): 57, ProcessCoord(pipe=1, data=6, model=2): 58, ProcessCoord(pipe=1, data=6, model=3): 59, ProcessCoord(pipe=1, data=7, model=0): 60, ProcessCoord(pipe=1, data=7, model=1): 61, ProcessCoord(pipe=1, data=7, model=2): 62, ProcessCoord(pipe=1, data=7, model=3): 63, ProcessCoord(pipe=2, 
data=0, model=0): 64, ProcessCoord(pipe=2, data=0, model=1): 65, ProcessCoord(pipe=2, data=0, model=2): 66, ProcessCoord(pipe=2, data=0, model=3): 67, ProcessCoord(pipe=2, data=1, model=0): 68, ProcessCoord(pipe=2, data=1, model=1): 69, ProcessCoord(pipe=2, data=1, model=2): 70, ProcessCoord(pipe=2, data=1, model=3): 71, ProcessCoord(pipe=2, data=2, model=0): 72, ProcessCoord(pipe=2, data=2, model=1): 73, ProcessCoord(pipe=2, data=2, model=2): 74, ProcessCoord(pipe=2, data=2, model=3): 75, ProcessCoord(pipe=2, data=3, model=0): 76, ProcessCoord(pipe=2, data=3, model=1): 77, ProcessCoord(pipe=2, data=3, model=2): 78, ProcessCoord(pipe=2, data=3, model=3): 79, ProcessCoord(pipe=2, data=4, model=0): 80, ProcessCoord(pipe=2, data=4, model=1): 81, ProcessCoord(pipe=2, data=4, model=2): 82, ProcessCoord(pipe=2, data=4, model=3): 83, ProcessCoord(pipe=2, data=5, model=0): 84, ProcessCoord(pipe=2, data=5, model=1): 85, ProcessCoord(pipe=2, data=5, model=2): 86, ProcessCoord(pipe=2, data=5, model=3): 87, ProcessCoord(pipe=2, data=6, model=0): 88, ProcessCoord(pipe=2, data=6, model=1): 89, ProcessCoord(pipe=2, data=6, model=2): 90, ProcessCoord(pipe=2, data=6, model=3): 91, ProcessCoord(pipe=2, data=7, model=0): 92, ProcessCoord(pipe=2, data=7, model=1): 93, ProcessCoord(pipe=2, data=7, model=2): 94, ProcessCoord(pipe=2, data=7, model=3): 95, ProcessCoord(pipe=3, data=0, model=0): 96, ProcessCoord(pipe=3, data=0, model=1): 97, ProcessCoord(pipe=3, data=0, model=2): 98, ProcessCoord(pipe=3, data=0, model=3): 99, ProcessCoord(pipe=3, data=1, model=0): 100, ProcessCoord(pipe=3, data=1, model=1): 101, ProcessCoord(pipe=3, data=1, model=2): 102, ProcessCoord(pipe=3, data=1, model=3): 103, ProcessCoord(pipe=3, data=2, model=0): 104, ProcessCoord(pipe=3, data=2, model=1): 105, ProcessCoord(pipe=3, data=2, model=2): 106, ProcessCoord(pipe=3, data=2, model=3): 107, ProcessCoord(pipe=3, data=3, model=0): 108, ProcessCoord(pipe=3, data=3, model=1): 109, ProcessCoord(pipe=3, data=3, model=2): 110, ProcessCoord(pipe=3, data=3, model=3): 111, ProcessCoord(pipe=3, data=4, model=0): 112, ProcessCoord(pipe=3, data=4, model=1): 113, ProcessCoord(pipe=3, data=4, model=2): 114, ProcessCoord(pipe=3, data=4, model=3): 115, ProcessCoord(pipe=3, data=5, model=0): 116, ProcessCoord(pipe=3, data=5, model=1): 117, ProcessCoord(pipe=3, data=5, model=2): 118, ProcessCoord(pipe=3, data=5, model=3): 119, ProcessCoord(pipe=3, data=6, model=0): 120, ProcessCoord(pipe=3, data=6, model=1): 121, ProcessCoord(pipe=3, data=6, model=2): 122, ProcessCoord(pipe=3, data=6, model=3): 123, ProcessCoord(pipe=3, data=7, model=0): 124, ProcessCoord(pipe=3, data=7, model=1): 125, ProcessCoord(pipe=3, data=7, model=2): 126, ProcessCoord(pipe=3, data=7, model=3): 127, ProcessCoord(pipe=4, data=0, model=0): 128, ProcessCoord(pipe=4, data=0, model=1): 129, ProcessCoord(pipe=4, data=0, model=2): 130, ProcessCoord(pipe=4, data=0, model=3): 131, ProcessCoord(pipe=4, data=1, model=0): 132, ProcessCoord(pipe=4, data=1, model=1): 133, ProcessCoord(pipe=4, data=1, model=2): 134, ProcessCoord(pipe=4, data=1, model=3): 135, ProcessCoord(pipe=4, data=2, model=0): 136, ProcessCoord(pipe=4, data=2, model=1): 137, ProcessCoord(pipe=4, data=2, model=2): 138, ProcessCoord(pipe=4, data=2, model=3): 139, ProcessCoord(pipe=4, data=3, model=0): 140, ProcessCoord(pipe=4, data=3, model=1): 141, ProcessCoord(pipe=4, data=3, model=2): 142, ProcessCoord(pipe=4, data=3, model=3): 143, ProcessCoord(pipe=4, data=4, model=0): 144, ProcessCoord(pipe=4, data=4, model=1): 145, 
ProcessCoord(pipe=4, data=4, model=2): 146, ProcessCoord(pipe=4, data=4, model=3): 147, ProcessCoord(pipe=4, data=5, model=0): 148, ProcessCoord(pipe=4, data=5, model=1): 149, ProcessCoord(pipe=4, data=5, model=2): 150, ProcessCoord(pipe=4, data=5, model=3): 151, ProcessCoord(pipe=4, data=6, model=0): 152, ProcessCoord(pipe=4, data=6, model=1): 153, ProcessCoord(pipe=4, data=6, model=2): 154, ProcessCoord(pipe=4, data=6, model=3): 155, ProcessCoord(pipe=4, data=7, model=0): 156, ProcessCoord(pipe=4, data=7, model=1): 157, ProcessCoord(pipe=4, data=7, model=2): 158, ProcessCoord(pipe=4, data=7, model=3): 159, ProcessCoord(pipe=5, data=0, model=0): 160, ProcessCoord(pipe=5, data=0, model=1): 161, ProcessCoord(pipe=5, data=0, model=2): 162, ProcessCoord(pipe=5, data=0, model=3): 163, ProcessCoord(pipe=5, data=1, model=0): 164, ProcessCoord(pipe=5, data=1, model=1): 165, ProcessCoord(pipe=5, data=1, model=2): 166, ProcessCoord(pipe=5, data=1, model=3): 167, ProcessCoord(pipe=5, data=2, model=0): 168, ProcessCoord(pipe=5, data=2, model=1): 169, ProcessCoord(pipe=5, data=2, model=2): 170, ProcessCoord(pipe=5, data=2, model=3): 171, ProcessCoord(pipe=5, data=3, model=0): 172, ProcessCoord(pipe=5, data=3, model=1): 173, ProcessCoord(pipe=5, data=3, model=2): 174, ProcessCoord(pipe=5, data=3, model=3): 175, ProcessCoord(pipe=5, data=4, model=0): 176, ProcessCoord(pipe=5, data=4, model=1): 177, ProcessCoord(pipe=5, data=4, model=2): 178, ProcessCoord(pipe=5, data=4, model=3): 179, ProcessCoord(pipe=5, data=5, model=0): 180, ProcessCoord(pipe=5, data=5, model=1): 181, ProcessCoord(pipe=5, data=5, model=2): 182, ProcessCoord(pipe=5, data=5, model=3): 183, ProcessCoord(pipe=5, data=6, model=0): 184, ProcessCoord(pipe=5, data=6, model=1): 185, ProcessCoord(pipe=5, data=6, model=2): 186, ProcessCoord(pipe=5, data=6, model=3): 187, ProcessCoord(pipe=5, data=7, model=0): 188, ProcessCoord(pipe=5, data=7, model=1): 189, ProcessCoord(pipe=5, data=7, model=2): 190, ProcessCoord(pipe=5, data=7, model=3): 191, ProcessCoord(pipe=6, data=0, model=0): 192, ProcessCoord(pipe=6, data=0, model=1): 193, ProcessCoord(pipe=6, data=0, model=2): 194, ProcessCoord(pipe=6, data=0, model=3): 195, ProcessCoord(pipe=6, data=1, model=0): 196, ProcessCoord(pipe=6, data=1, model=1): 197, ProcessCoord(pipe=6, data=1, model=2): 198, ProcessCoord(pipe=6, data=1, model=3): 199, ProcessCoord(pipe=6, data=2, model=0): 200, ProcessCoord(pipe=6, data=2, model=1): 201, ProcessCoord(pipe=6, data=2, model=2): 202, ProcessCoord(pipe=6, data=2, model=3): 203, ProcessCoord(pipe=6, data=3, model=0): 204, ProcessCoord(pipe=6, data=3, model=1): 205, ProcessCoord(pipe=6, data=3, model=2): 206, ProcessCoord(pipe=6, data=3, model=3): 207, ProcessCoord(pipe=6, data=4, model=0): 208, ProcessCoord(pipe=6, data=4, model=1): 209, ProcessCoord(pipe=6, data=4, model=2): 210, ProcessCoord(pipe=6, data=4, model=3): 211, ProcessCoord(pipe=6, data=5, model=0): 212, ProcessCoord(pipe=6, data=5, model=1): 213, ProcessCoord(pipe=6, data=5, model=2): 214, ProcessCoord(pipe=6, data=5, model=3): 215, ProcessCoord(pipe=6, data=6, model=0): 216, ProcessCoord(pipe=6, data=6, model=1): 217, ProcessCoord(pipe=6, data=6, model=2): 218, ProcessCoord(pipe=6, data=6, model=3): 219, ProcessCoord(pipe=6, data=7, model=0): 220, ProcessCoord(pipe=6, data=7, model=1): 221, ProcessCoord(pipe=6, data=7, model=2): 222, ProcessCoord(pipe=6, data=7, model=3): 223, ProcessCoord(pipe=7, data=0, model=0): 224, ProcessCoord(pipe=7, data=0, model=1): 225, ProcessCoord(pipe=7, data=0, 
model=2): 226, ProcessCoord(pipe=7, data=0, model=3): 227, ProcessCoord(pipe=7, data=1, model=0): 228, ProcessCoord(pipe=7, data=1, model=1): 229, ProcessCoord(pipe=7, data=1, model=2): 230, ProcessCoord(pipe=7, data=1, model=3): 231, ProcessCoord(pipe=7, data=2, model=0): 232, ProcessCoord(pipe=7, data=2, model=1): 233, ProcessCoord(pipe=7, data=2, model=2): 234, ProcessCoord(pipe=7, data=2, model=3): 235, ProcessCoord(pipe=7, data=3, model=0): 236, ProcessCoord(pipe=7, data=3, model=1): 237, ProcessCoord(pipe=7, data=3, model=2): 238, ProcessCoord(pipe=7, data=3, model=3): 239, ProcessCoord(pipe=7, data=4, model=0): 240, ProcessCoord(pipe=7, data=4, model=1): 241, ProcessCoord(pipe=7, data=4, model=2): 242, ProcessCoord(pipe=7, data=4, model=3): 243, ProcessCoord(pipe=7, data=5, model=0): 244, ProcessCoord(pipe=7, data=5, model=1): 245, ProcessCoord(pipe=7, data=5, model=2): 246, ProcessCoord(pipe=7, data=5, model=3): 247, ProcessCoord(pipe=7, data=6, model=0): 248, ProcessCoord(pipe=7, data=6, model=1): 249, ProcessCoord(pipe=7, data=6, model=2): 250, ProcessCoord(pipe=7, data=6, model=3): 251, ProcessCoord(pipe=7, data=7, model=0): 252, ProcessCoord(pipe=7, data=7, model=1): 253, ProcessCoord(pipe=7, data=7, model=2): 254, ProcessCoord(pipe=7, data=7, model=3): 255}
-[2021-09-25 04:27:39,312] [INFO] [module.py:360:_partition_layers] Partitioning pipeline stages with method type:transformer
-stage=0 layers=7
-     0: _to_float16
-     1: EmbeddingPipe
-     2:
-     3: ParallelTransformerLayerPipe
-     4: ParallelTransformerLayerPipe
-     5: ParallelTransformerLayerPipe
-     6: ParallelTransformerLayerPipe
-stage=1 layers=4
-     7: ParallelTransformerLayerPipe
-     8: ParallelTransformerLayerPipe
-     9: ParallelTransformerLayerPipe
-    10: ParallelTransformerLayerPipe
-stage=2 layers=4
-    11: ParallelTransformerLayerPipe
-    12: ParallelTransformerLayerPipe
-    13: ParallelTransformerLayerPipe
-    14: ParallelTransformerLayerPipe
-stage=3 layers=4
-    15: ParallelTransformerLayerPipe
-    16: ParallelTransformerLayerPipe
-    17: ParallelTransformerLayerPipe
-    18: ParallelTransformerLayerPipe
-stage=4 layers=4
-    19: ParallelTransformerLayerPipe
-    20: ParallelTransformerLayerPipe
-    21: ParallelTransformerLayerPipe
-    22: ParallelTransformerLayerPipe
-stage=5 layers=4
-    23: ParallelTransformerLayerPipe
-    24: ParallelTransformerLayerPipe
-    25: ParallelTransformerLayerPipe
-    26: ParallelTransformerLayerPipe
-stage=6 layers=4
-    27: ParallelTransformerLayerPipe
-    28: ParallelTransformerLayerPipe
-    29: ParallelTransformerLayerPipe
-    30: ParallelTransformerLayerPipe
-stage=7 layers=8
-    31: ParallelTransformerLayerPipe
-    32: ParallelTransformerLayerPipe
-    33: ParallelTransformerLayerPipe
-    34: ParallelTransformerLayerPipe
-    35:
-    36: MixedFusedLayerNorm
-    37: EmbeddingPipe
-    38: float16_to_fp32
-  loss: CrossEntropy
- > number of parameters on (tensor, pipeline) model parallel rank (2, 2): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (0, 1): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (0, 2): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (2, 1): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (3, 1): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (1, 1): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (1, 4): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (0, 6): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (0, 5): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (3, 5): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (2, 4): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (1, 5): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (2, 5): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (0, 4): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (3, 4): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (1, 6): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (1, 3): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (2, 6): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (2, 3): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (3, 6): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (3, 3): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (0, 3): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (1, 2): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (3, 2): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (2, 0): 1986465792
- > number of parameters on (tensor, pipeline) model parallel rank (0, 7): 1986498560
- > number of parameters on (tensor, pipeline) model parallel rank (1, 0): 1986465792
- > number of parameters on (tensor, pipeline) model parallel rank (2, 7): 1986498560
- > number of parameters on (tensor, pipeline) model parallel rank (1, 7): 1986498560
- > number of parameters on (tensor, pipeline) model parallel rank (3, 0): 1986465792
- > number of parameters on (tensor, pipeline) model parallel rank (3, 7): 1986498560
-[2021-09-25 04:27:40,518] [INFO] [utils.py:680:see_memory_usage] After Building Model
-[2021-09-25 04:27:40,519] [INFO] [utils.py:681:see_memory_usage] MA 3.77 GB Max_MA 3.79 GB CA 3.79 GB Max_CA 4 GB
-[2021-09-25 04:27:40,519] [INFO] [utils.py:689:see_memory_usage] CPU Virtual Memory: used = 36.87 GB, percent = 19.7%
- > number of parameters on (tensor, pipeline) model parallel rank (0, 0): 1986465792
-setting training iterations to 159576
-> learning rate decay style: cosine
-DeepSpeed is enabled.
-[2021-09-25 04:27:40,540] [INFO] [logging.py:68:log_dist] [Rank 0] DeepSpeed info: version=0.4.2+bc17042, git-hash=bc17042, git-branch=big-science -[2021-09-25 04:27:40,690] [INFO] [engine.py:179:__init__] DeepSpeed Flops Profiler Enabled: False -[2021-09-25 04:27:40,690] [INFO] [engine.py:736:_configure_optimizer] Removing param_group that has no 'params' in the client Optimizer -[2021-09-25 04:27:40,690] [INFO] [engine.py:741:_configure_optimizer] Using client Optimizer as basic optimizer -[2021-09-25 04:27:40,690] [INFO] [engine.py:750:_configure_optimizer] DeepSpeed Basic Optimizer = FusedAdam -[2021-09-25 04:27:40,690] [INFO] [utils.py:43:is_zero_supported_optimizer] Checking ZeRO support for optimizer=FusedAdam type= -[2021-09-25 04:27:40,690] [INFO] [logging.py:68:log_dist] [Rank 0] Creating fp16 ZeRO stage 1 optimizer -[2021-09-25 04:27:40,690] [INFO] [stage2.py:106:__init__] Reduce bucket size 500000000 -[2021-09-25 04:27:40,690] [INFO] [stage2.py:107:__init__] Allgather bucket size 500000000 -[2021-09-25 04:27:40,691] [INFO] [stage2.py:108:__init__] CPU Offload: False -[2021-09-25 04:27:40,691] [INFO] [stage2.py:109:__init__] Round robin gradient partitioning: False -[2021-09-25 04:27:45,267] [INFO] [stage2.py:419:__init__] optimizer state initialized -[2021-09-25 04:27:45,267] [INFO] [logging.py:68:log_dist] [Rank 0] DeepSpeed Final Optimizer = FusedAdam -[2021-09-25 04:27:45,267] [INFO] [engine.py:553:_configure_lr_scheduler] DeepSpeed using client LR scheduler -[2021-09-25 04:27:45,267] [INFO] [logging.py:68:log_dist] [Rank 0] DeepSpeed LR Scheduler = -[2021-09-25 04:27:45,267] [INFO] [logging.py:68:log_dist] [Rank 0] step=0, skipped=0, lr=[0.0, 0.0], mom=[(0.9, 0.999), (0.9, 0.999)] -[2021-09-25 04:27:45,267] [INFO] [config.py:900:print] DeepSpeedEngine configuration: -[2021-09-25 04:27:45,268] [INFO] [config.py:904:print] activation_checkpointing_config { - "partition_activations": false, - "contiguous_memory_optimization": false, - "cpu_checkpointing": false, - "number_checkpoints": null, - "synchronize_checkpoint_boundary": false, - "profile": false -} -[2021-09-25 04:27:45,268] [INFO] [config.py:904:print] aio_config ................... {'block_size': 1048576, 'queue_depth': 8, 'thread_count': 1, 'single_submit': False, 'overlap_events': True} -[2021-09-25 04:27:45,268] [INFO] [config.py:904:print] allreduce_always_fp32 ........ False -[2021-09-25 04:27:45,268] [INFO] [config.py:904:print] amp_enabled .................. False -[2021-09-25 04:27:45,268] [INFO] [config.py:904:print] amp_params ................... False -[2021-09-25 04:27:45,268] [INFO] [config.py:904:print] checkpoint_tag_validation_enabled True -[2021-09-25 04:27:45,268] [INFO] [config.py:904:print] checkpoint_tag_validation_fail False -[2021-09-25 04:27:45,268] [INFO] [config.py:904:print] disable_allgather ............ False -[2021-09-25 04:27:45,268] [INFO] [config.py:904:print] dump_state ................... False -[2021-09-25 04:27:45,268] [INFO] [config.py:904:print] dynamic_loss_scale_args ...... {'init_scale': 4096, 'scale_window': 500, 'delayed_shift': 2, 'min_scale': 1} -[2021-09-25 04:27:45,268] [INFO] [config.py:904:print] eigenvalue_enabled ........... False -[2021-09-25 04:27:45,268] [INFO] [config.py:904:print] eigenvalue_gas_boundary_resolution 1 -[2021-09-25 04:27:45,268] [INFO] [config.py:904:print] eigenvalue_layer_name ........ bert.encoder.layer -[2021-09-25 04:27:45,268] [INFO] [config.py:904:print] eigenvalue_layer_num ......... 
0 -[2021-09-25 04:27:45,268] [INFO] [config.py:904:print] eigenvalue_max_iter .......... 100 -[2021-09-25 04:27:45,268] [INFO] [config.py:904:print] eigenvalue_stability ......... 1e-06 -[2021-09-25 04:27:45,268] [INFO] [config.py:904:print] eigenvalue_tol ............... 0.01 -[2021-09-25 04:27:45,268] [INFO] [config.py:904:print] eigenvalue_verbose ........... False -[2021-09-25 04:27:45,268] [INFO] [config.py:904:print] elasticity_enabled ........... False -[2021-09-25 04:27:45,268] [INFO] [config.py:904:print] flops_profiler_config ........ { - "enabled": false, - "profile_step": 1, - "module_depth": -1, - "top_modules": 1, - "detailed": true, - "output_file": null -} -[2021-09-25 04:27:45,268] [INFO] [config.py:904:print] fp16_enabled ................. True -[2021-09-25 04:27:45,268] [INFO] [config.py:904:print] fp16_mixed_quantize .......... False -[2021-09-25 04:27:45,268] [INFO] [config.py:904:print] global_rank .................. 0 -[2021-09-25 04:27:45,268] [INFO] [config.py:904:print] gradient_accumulation_steps .. 256 -[2021-09-25 04:27:45,269] [INFO] [config.py:904:print] gradient_clipping ............ 1.0 -[2021-09-25 04:27:45,269] [INFO] [config.py:904:print] gradient_predivide_factor .... 1.0 -[2021-09-25 04:27:45,269] [INFO] [config.py:904:print] initial_dynamic_scale ........ 4096 -[2021-09-25 04:27:45,269] [INFO] [config.py:904:print] loss_scale ................... 0 -[2021-09-25 04:27:45,269] [INFO] [config.py:904:print] memory_breakdown ............. False -[2021-09-25 04:27:45,269] [INFO] [config.py:904:print] optimizer_legacy_fusion ...... False -[2021-09-25 04:27:45,269] [INFO] [config.py:904:print] optimizer_name ............... None -[2021-09-25 04:27:45,269] [INFO] [config.py:904:print] optimizer_params ............. None -[2021-09-25 04:27:45,269] [INFO] [config.py:904:print] pipeline ..................... {'stages': 'auto', 'partition': 'best', 'seed_layers': False, 'activation_checkpoint_interval': 0} -[2021-09-25 04:27:45,269] [INFO] [config.py:904:print] pld_enabled .................. False -[2021-09-25 04:27:45,269] [INFO] [config.py:904:print] pld_params ................... False -[2021-09-25 04:27:45,269] [INFO] [config.py:904:print] prescale_gradients ........... False -[2021-09-25 04:27:45,269] [INFO] [config.py:904:print] quantize_change_rate ......... 0.001 -[2021-09-25 04:27:45,269] [INFO] [config.py:904:print] quantize_groups .............. 1 -[2021-09-25 04:27:45,269] [INFO] [config.py:904:print] quantize_offset .............. 1000 -[2021-09-25 04:27:45,269] [INFO] [config.py:904:print] quantize_period .............. 1000 -[2021-09-25 04:27:45,269] [INFO] [config.py:904:print] quantize_rounding ............ 0 -[2021-09-25 04:27:45,269] [INFO] [config.py:904:print] quantize_start_bits .......... 16 -[2021-09-25 04:27:45,269] [INFO] [config.py:904:print] quantize_target_bits ......... 8 -[2021-09-25 04:27:45,269] [INFO] [config.py:904:print] quantize_training_enabled .... False -[2021-09-25 04:27:45,269] [INFO] [config.py:904:print] quantize_type ................ 0 -[2021-09-25 04:27:45,269] [INFO] [config.py:904:print] quantize_verbose ............. False -[2021-09-25 04:27:45,269] [INFO] [config.py:904:print] scheduler_name ............... None -[2021-09-25 04:27:45,269] [INFO] [config.py:904:print] scheduler_params ............. None -[2021-09-25 04:27:45,269] [INFO] [config.py:904:print] sparse_attention ............. None -[2021-09-25 04:27:45,269] [INFO] [config.py:904:print] sparse_gradients_enabled ..... 
False -[2021-09-25 04:27:45,269] [INFO] [config.py:904:print] steps_per_print .............. 2000 -[2021-09-25 04:27:45,269] [INFO] [config.py:904:print] tensorboard_enabled .......... False -[2021-09-25 04:27:45,269] [INFO] [config.py:904:print] tensorboard_job_name ......... DeepSpeedJobName -[2021-09-25 04:27:45,269] [INFO] [config.py:904:print] tensorboard_output_path ...... -[2021-09-25 04:27:45,269] [INFO] [config.py:904:print] train_batch_size ............. 2048 -[2021-09-25 04:27:45,269] [INFO] [config.py:904:print] train_micro_batch_size_per_gpu 1 -[2021-09-25 04:27:45,269] [INFO] [config.py:904:print] use_quantizer_kernel ......... False -[2021-09-25 04:27:45,269] [INFO] [config.py:904:print] wall_clock_breakdown ......... False -[2021-09-25 04:27:45,269] [INFO] [config.py:904:print] world_size ................... 8 -[2021-09-25 04:27:45,269] [INFO] [config.py:904:print] zero_allow_untested_optimizer False -[2021-09-25 04:27:45,270] [INFO] [config.py:904:print] zero_config .................. { - "stage": 1, - "contiguous_gradients": false, - "reduce_scatter": true, - "reduce_bucket_size": 5.000000e+08, - "allgather_partitions": true, - "allgather_bucket_size": 5.000000e+08, - "overlap_comm": false, - "load_from_fp32_weights": true, - "elastic_checkpoint": true, - "offload_param": null, - "offload_optimizer": null, - "sub_group_size": 1.000000e+09, - "prefetch_bucket_size": 5.000000e+07, - "param_persistence_threshold": 1.000000e+05, - "max_live_parameters": 1.000000e+09, - "max_reuse_distance": 1.000000e+09, - "gather_fp16_weights_on_model_save": false, - "ignore_unused_parameters": true, - "round_robin_gradients": false, - "legacy_stage1": false -} -[2021-09-25 04:27:45,270] [INFO] [config.py:904:print] zero_enabled ................. True -[2021-09-25 04:27:45,270] [INFO] [config.py:904:print] zero_optimization_stage ...... 
1 -[2021-09-25 04:27:45,270] [INFO] [config.py:906:print] json = { - "train_micro_batch_size_per_gpu": 1, - "train_batch_size": 2.048000e+03, - "gradient_clipping": 1.0, - "zero_optimization": { - "stage": 1 - }, - "fp16": { - "enabled": true, - "loss_scale": 0, - "loss_scale_window": 500, - "hysteresis": 2, - "min_loss_scale": 1, - "initial_scale_power": 12 - }, - "steps_per_print": 2.000000e+03, - "wall_clock_breakdown": false -} -[2021-09-25 04:27:45,270] [INFO] [engine.py:76:__init__] CONFIG: micro_batches=256 micro_batch_size=1 -[2021-09-25 04:27:45,575] [INFO] [engine.py:134:__init__] RANK=0 STAGE=0 LAYERS=7 [0, 7) STAGE_PARAMS=1986465792 (1986.466M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-25 04:27:45,575] [INFO] [engine.py:134:__init__] RANK=2 STAGE=0 LAYERS=7 [0, 7) STAGE_PARAMS=1986465792 (1986.466M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-25 04:27:45,575] [INFO] [engine.py:134:__init__] RANK=3 STAGE=0 LAYERS=7 [0, 7) STAGE_PARAMS=1986465792 (1986.466M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-25 04:27:45,575] [INFO] [engine.py:134:__init__] RANK=1 STAGE=0 LAYERS=7 [0, 7) STAGE_PARAMS=1986465792 (1986.466M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-25 04:27:45,575] [INFO] [engine.py:134:__init__] RANK=129 STAGE=4 LAYERS=4 [19, 23) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-25 04:27:45,575] [INFO] [engine.py:134:__init__] RANK=131 STAGE=4 LAYERS=4 [19, 23) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-25 04:27:45,575] [INFO] [engine.py:134:__init__] RANK=130 STAGE=4 LAYERS=4 [19, 23) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-25 04:27:45,575] [INFO] [engine.py:134:__init__] RANK=128 STAGE=4 LAYERS=4 [19, 23) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-25 04:27:45,575] [INFO] [engine.py:134:__init__] RANK=97 STAGE=3 LAYERS=4 [15, 19) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-25 04:27:45,575] [INFO] [engine.py:134:__init__] RANK=98 STAGE=3 LAYERS=4 [15, 19) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-25 04:27:45,575] [INFO] [engine.py:134:__init__] RANK=96 STAGE=3 LAYERS=4 [15, 19) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-25 04:27:45,575] [INFO] [engine.py:134:__init__] RANK=99 STAGE=3 LAYERS=4 [15, 19) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-25 04:27:45,575] [INFO] [engine.py:134:__init__] RANK=195 STAGE=6 LAYERS=4 [27, 31) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-25 04:27:45,575] [INFO] [engine.py:134:__init__] RANK=193 STAGE=6 LAYERS=4 [27, 31) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-25 04:27:45,575] [INFO] [engine.py:134:__init__] RANK=194 STAGE=6 LAYERS=4 [27, 31) STAGE_PARAMS=1745293312 (1745.293M) 
TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-25 04:27:45,575] [INFO] [engine.py:134:__init__] RANK=192 STAGE=6 LAYERS=4 [27, 31) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-25 04:27:45,575] [INFO] [engine.py:134:__init__] RANK=65 STAGE=2 LAYERS=4 [11, 15) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-25 04:27:45,575] [INFO] [engine.py:134:__init__] RANK=64 STAGE=2 LAYERS=4 [11, 15) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-25 04:27:45,575] [INFO] [engine.py:134:__init__] RANK=67 STAGE=2 LAYERS=4 [11, 15) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-25 04:27:45,575] [INFO] [engine.py:134:__init__] RANK=66 STAGE=2 LAYERS=4 [11, 15) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-25 04:27:45,575] [INFO] [engine.py:134:__init__] RANK=163 STAGE=5 LAYERS=4 [23, 27) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-25 04:27:45,575] [INFO] [engine.py:134:__init__] RANK=160 STAGE=5 LAYERS=4 [23, 27) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-25 04:27:45,575] [INFO] [engine.py:134:__init__] RANK=162 STAGE=5 LAYERS=4 [23, 27) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-25 04:27:45,575] [INFO] [engine.py:134:__init__] RANK=161 STAGE=5 LAYERS=4 [23, 27) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-25 04:27:45,575] [INFO] [engine.py:134:__init__] RANK=225 STAGE=7 LAYERS=8 [31, 39) STAGE_PARAMS=1986498560 (1986.499M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-25 04:27:45,575] [INFO] [engine.py:134:__init__] RANK=226 STAGE=7 LAYERS=8 [31, 39) STAGE_PARAMS=1986498560 (1986.499M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-25 04:27:45,575] [INFO] [engine.py:134:__init__] RANK=227 STAGE=7 LAYERS=8 [31, 39) STAGE_PARAMS=1986498560 (1986.499M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-25 04:27:45,575] [INFO] [engine.py:134:__init__] RANK=224 STAGE=7 LAYERS=8 [31, 39) STAGE_PARAMS=1986498560 (1986.499M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-25 04:27:45,575] [INFO] [engine.py:134:__init__] RANK=32 STAGE=1 LAYERS=4 [7, 11) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-25 04:27:45,575] [INFO] [engine.py:134:__init__] RANK=33 STAGE=1 LAYERS=4 [7, 11) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-25 04:27:45,575] [INFO] [engine.py:134:__init__] RANK=34 STAGE=1 LAYERS=4 [7, 11) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-25 04:27:45,575] [INFO] [engine.py:134:__init__] RANK=35 STAGE=1 LAYERS=4 [7, 11) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) 
UNIQUE_PARAMS=56814206976 (56814.207M) - > using checkpoint value 6e-05 for learning rate - > using checkpoint value 6e-06 for minimum learning rate - > using checkpoint value 216320 for warmup iterations - > using checkpoint value 126953125 for total number of iterations - > using checkpoint value cosine for decay style -successfully loaded 8 ZeRO state_dicts for rank 48 -successfully loaded 8 ZeRO state_dicts for rank 156 -successfully loaded 8 ZeRO state_dicts for rank 223 -successfully loaded 8 ZeRO state_dicts for rank 220 -successfully loaded 8 ZeRO state_dicts for rank 49 -successfully loaded 8 ZeRO state_dicts for rank 142 -successfully loaded 8 ZeRO state_dicts for rank 221 -successfully loaded 8 ZeRO state_dicts for rank 75 -successfully loaded 8 ZeRO state_dicts for rank 50 -successfully loaded 8 ZeRO state_dicts for rank 158 -successfully loaded 8 ZeRO state_dicts for rank 33 -successfully loaded 8 ZeRO state_dicts for rank 184 -successfully loaded 8 ZeRO state_dicts for rank 216 -successfully loaded 8 ZeRO state_dicts for rank 215 -successfully loaded 8 ZeRO state_dicts for rank 157 -successfully loaded 8 ZeRO state_dicts for rank 141 -successfully loaded 8 ZeRO state_dicts for rank 222 -successfully loaded 8 ZeRO state_dicts for rank 108 -successfully loaded 8 ZeRO state_dicts for rank 58 -successfully loaded 8 ZeRO state_dicts for rank 104 -successfully loaded 8 ZeRO state_dicts for rank 100 -successfully loaded 8 ZeRO state_dicts for rank 213 -successfully loaded 8 ZeRO state_dicts for rank 91 -successfully loaded 8 ZeRO state_dicts for rank 89 -successfully loaded 8 ZeRO state_dicts for rank 40 -successfully loaded 8 ZeRO state_dicts for rank 35 -successfully loaded 8 ZeRO state_dicts for rank 144 -successfully loaded 8 ZeRO state_dicts for rank 140 -successfully loaded 8 ZeRO state_dicts for rank 60 -successfully loaded 8 ZeRO state_dicts for rank 73 -successfully loaded 8 ZeRO state_dicts for rank 51 -successfully loaded 8 ZeRO state_dicts for rank 72 -successfully loaded 8 ZeRO state_dicts for rank 159 -successfully loaded 8 ZeRO state_dicts for rank 212 -successfully loaded 8 ZeRO state_dicts for rank 146 -successfully loaded 8 ZeRO state_dicts for rank 214 -successfully loaded 8 ZeRO state_dicts for rank 143 -successfully loaded 8 ZeRO state_dicts for rank 164 -successfully loaded 8 ZeRO state_dicts for rank 34 -successfully loaded 8 ZeRO state_dicts for rank 52 -successfully loaded 8 ZeRO state_dicts for rank 131 -successfully loaded 8 ZeRO state_dicts for rank 132 -successfully loaded 8 ZeRO state_dicts for rank 45 -successfully loaded 8 ZeRO state_dicts for rank 211 -successfully loaded 8 ZeRO state_dicts for rank 61 -successfully loaded 8 ZeRO state_dicts for rank 154 -successfully loaded 8 ZeRO state_dicts for rank 88 -successfully loaded 8 ZeRO state_dicts for rank 87 -successfully loaded 8 ZeRO state_dicts for rank 90 -successfully loaded 8 ZeRO state_dicts for rank 84 -successfully loaded 8 ZeRO state_dicts for rank 53 -successfully loaded 8 ZeRO state_dicts for rank 116 -successfully loaded 8 ZeRO state_dicts for rank 59 -successfully loaded 8 ZeRO state_dicts for rank 128 -successfully loaded 8 ZeRO state_dicts for rank 165 -successfully loaded 8 ZeRO state_dicts for rank 192 -successfully loaded 8 ZeRO state_dicts for rank 210 -successfully loaded 8 ZeRO state_dicts for rank 129 -successfully loaded 8 ZeRO state_dicts for rank 93 -successfully loaded 8 ZeRO state_dicts for rank 62 -successfully loaded 8 ZeRO state_dicts for rank 81 -successfully loaded 8 
ZeRO state_dicts for rank 203 -successfully loaded 8 ZeRO state_dicts for rank 76 -successfully loaded 8 ZeRO state_dicts for rank 127 -successfully loaded 8 ZeRO state_dicts for rank 38 -successfully loaded 8 ZeRO state_dicts for rank 160 -successfully loaded 8 ZeRO state_dicts for rank 113 -successfully loaded 8 ZeRO state_dicts for rank 63 -successfully loaded 8 ZeRO state_dicts for rank 145 -successfully loaded 8 ZeRO state_dicts for rank 36 -successfully loaded 8 ZeRO state_dicts for rank 57 -successfully loaded 8 ZeRO state_dicts for rank 99 -successfully loaded 8 ZeRO state_dicts for rank 67 -successfully loaded 8 ZeRO state_dicts for rank 32 -successfully loaded 8 ZeRO state_dicts for rank 47 -successfully loaded 8 ZeRO state_dicts for rank 147 -successfully loaded 8 ZeRO state_dicts for rank 112 -successfully loaded 8 ZeRO state_dicts for rank 150 -successfully loaded 8 ZeRO state_dicts for rank 178 -successfully loaded 8 ZeRO state_dicts for rank 166 -successfully loaded 8 ZeRO state_dicts for rank 161 -successfully loaded 8 ZeRO state_dicts for rank 219 -successfully loaded 8 ZeRO state_dicts for rank 120 -successfully loaded 8 ZeRO state_dicts for rank 56 -loading 8 zero partition checkpoints for rank 223 -successfully loaded 8 ZeRO state_dicts for rank 54 -successfully loaded 8 ZeRO state_dicts for rank 130 -successfully loaded 8 ZeRO state_dicts for rank 79 -successfully loaded 8 ZeRO state_dicts for rank 218 -successfully loaded 8 ZeRO state_dicts for rank 65 -successfully loaded 8 ZeRO state_dicts for rank 115 -successfully loaded 8 ZeRO state_dicts for rank 85 -loading 8 zero partition checkpoints for rank 156 -successfully loaded 8 ZeRO state_dicts for rank 109 -successfully loaded 8 ZeRO state_dicts for rank 209 -successfully loaded 8 ZeRO state_dicts for rank 152 -successfully loaded 8 ZeRO state_dicts for rank 83 -successfully loaded 8 ZeRO state_dicts for rank 103 -successfully loaded 8 ZeRO state_dicts for rank 66 -successfully loaded 8 ZeRO state_dicts for rank 44 -successfully loaded 8 ZeRO state_dicts for rank 74 -successfully loaded 8 ZeRO state_dicts for rank 96 -successfully loaded 8 ZeRO state_dicts for rank 86 -successfully loaded 8 ZeRO state_dicts for rank 151 -successfully loaded 8 ZeRO state_dicts for rank 171 -successfully loaded 8 ZeRO state_dicts for rank 135 -successfully loaded 8 ZeRO state_dicts for rank 14 -successfully loaded 8 ZeRO state_dicts for rank 64 -successfully loaded 8 ZeRO state_dicts for rank 196 -successfully loaded 8 ZeRO state_dicts for rank 123 -successfully loaded 8 ZeRO state_dicts for rank 136 -successfully loaded 8 ZeRO state_dicts for rank 181 -successfully loaded 8 ZeRO state_dicts for rank 55 -successfully loaded 8 ZeRO state_dicts for rank 228 -loading 8 zero partition checkpoints for rank 48 -successfully loaded 8 ZeRO state_dicts for rank 124 -successfully loaded 8 ZeRO state_dicts for rank 170 -successfully loaded 8 ZeRO state_dicts for rank 208 -successfully loaded 8 ZeRO state_dicts for rank 105 -successfully loaded 8 ZeRO state_dicts for rank 95 -successfully loaded 8 ZeRO state_dicts for rank 134 -successfully loaded 8 ZeRO state_dicts for rank 153 -successfully loaded 8 ZeRO state_dicts for rank 204 -successfully loaded 8 ZeRO state_dicts for rank 125 -successfully loaded 8 ZeRO state_dicts for rank 111 -successfully loaded 8 ZeRO state_dicts for rank 133 -successfully loaded 8 ZeRO state_dicts for rank 149 -successfully loaded 8 ZeRO state_dicts for rank 194 -successfully loaded 8 ZeRO state_dicts for rank 148 
-successfully loaded 8 ZeRO state_dicts for rank 217
-successfully loaded 8 ZeRO state_dicts for rank 206
-successfully loaded 8 ZeRO state_dicts for rank 114
-successfully loaded 8 ZeRO state_dicts for rank 200
-loading 8 zero partition checkpoints for rank 220
-successfully loaded 8 ZeRO state_dicts for rank 202
-successfully loaded 8 ZeRO state_dicts for rank 138
-successfully loaded 8 ZeRO state_dicts for rank 139
-successfully loaded 8 ZeRO state_dicts for rank 37
-successfully loaded 8 ZeRO state_dicts for rank 176
-successfully loaded 8 ZeRO state_dicts for rank 168
-successfully loaded 8 ZeRO state_dicts for rank 98
-successfully loaded 8 ZeRO state_dicts for rank 101
-successfully loaded 8 ZeRO state_dicts for rank 39
-successfully loaded 8 ZeRO state_dicts for rank 107
-successfully loaded 8 ZeRO state_dicts for rank 42
-successfully loaded 8 ZeRO state_dicts for rank 8
-successfully loaded 8 ZeRO state_dicts for rank 186
-successfully loaded 8 ZeRO state_dicts for rank 94
-loading 8 zero partition checkpoints for rank 142
-successfully loaded 8 ZeRO state_dicts for rank 77
-successfully loaded 8 ZeRO state_dicts for rank 137
-successfully loaded 8 ZeRO state_dicts for rank 207
-successfully loaded 8 ZeRO state_dicts for rank 172
-successfully loaded 8 ZeRO state_dicts for rank 199
-successfully loaded 8 ZeRO state_dicts for rank 43
-successfully loaded 8 ZeRO state_dicts for rank 69
-successfully loaded 8 ZeRO state_dicts for rank 205
-successfully loaded 8 ZeRO state_dicts for rank 167
-successfully loaded 8 ZeRO state_dicts for rank 41
-successfully loaded 8 ZeRO state_dicts for rank 80
-successfully loaded 8 ZeRO state_dicts for rank 119
-successfully loaded 8 ZeRO state_dicts for rank 106
-successfully loaded 8 ZeRO state_dicts for rank 187
-successfully loaded 8 ZeRO state_dicts for rank 197
-successfully loaded 8 ZeRO state_dicts for rank 92
-successfully loaded 8 ZeRO state_dicts for rank 236
-successfully loaded 8 ZeRO state_dicts for rank 97
-successfully loaded 8 ZeRO state_dicts for rank 155
-successfully loaded 8 ZeRO state_dicts for rank 82
-successfully loaded 8 ZeRO state_dicts for rank 185
-successfully loaded 8 ZeRO state_dicts for rank 78
-successfully loaded 8 ZeRO state_dicts for rank 10
-successfully loaded 8 ZeRO state_dicts for rank 71
-successfully loaded 8 ZeRO state_dicts for rank 68
-successfully loaded 8 ZeRO state_dicts for rank 195
-successfully loaded 8 ZeRO state_dicts for rank 102
-successfully loaded 8 ZeRO state_dicts for rank 70
-successfully loaded 8 ZeRO state_dicts for rank 26
-successfully loaded 8 ZeRO state_dicts for rank 180
-successfully loaded 8 ZeRO state_dicts for rank 117
-loading 8 zero partition checkpoints for rank 75
-successfully loaded 8 ZeRO state_dicts for rank 121
-successfully loaded 8 ZeRO state_dicts for rank 174
-successfully loaded 8 ZeRO state_dicts for rank 24
-loading 8 zero partition checkpoints for rank 50
-successfully loaded 8 ZeRO state_dicts for rank 179
-successfully loaded 8 ZeRO state_dicts for rank 248
-successfully loaded 8 ZeRO state_dicts for rank 46
-successfully loaded 8 ZeRO state_dicts for rank 12
-successfully loaded 8 ZeRO state_dicts for rank 126
-successfully loaded 8 ZeRO state_dicts for rank 169
-loading 8 zero partition checkpoints for rank 216
-loading 8 zero partition checkpoints for rank 215
-successfully loaded 8 ZeRO state_dicts for rank 11
-successfully loaded 8 ZeRO state_dicts for rank 183
-successfully loaded 8 ZeRO state_dicts for rank 162
-loading 8 zero partition checkpoints for rank 222
-loading 8 zero partition checkpoints for rank 108
-successfully loaded 8 ZeRO state_dicts for rank 182
-successfully loaded 8 ZeRO state_dicts for rank 27
-successfully loaded 8 ZeRO state_dicts for rank 252
-successfully loaded 8 ZeRO state_dicts for rank 224
-successfully loaded 8 ZeRO state_dicts for rank 201
-successfully loaded 8 ZeRO state_dicts for rank 240
-successfully loaded 8 ZeRO state_dicts for rank 190
-loading 8 zero partition checkpoints for rank 141
-loading 8 zero partition checkpoints for rank 221
-successfully loaded 8 ZeRO state_dicts for rank 193
-successfully loaded 8 ZeRO state_dicts for rank 231
-successfully loaded 8 ZeRO state_dicts for rank 175
-successfully loaded 8 ZeRO state_dicts for rank 122
-successfully loaded 8 ZeRO state_dicts for rank 13
-loading 8 zero partition checkpoints for rank 157
-successfully loaded 8 ZeRO state_dicts for rank 110
-successfully loaded 8 ZeRO state_dicts for rank 233
-successfully loaded 8 ZeRO state_dicts for rank 118
-loading 8 zero partition checkpoints for rank 184
-successfully loaded 8 ZeRO state_dicts for rank 198
-successfully loaded 8 ZeRO state_dicts for rank 30
-successfully loaded 8 ZeRO state_dicts for rank 163
-successfully loaded 8 ZeRO state_dicts for rank 244
-successfully loaded 8 ZeRO state_dicts for rank 16
-successfully loaded 8 ZeRO state_dicts for rank 18
-successfully loaded 8 ZeRO state_dicts for rank 250
-successfully loaded 8 ZeRO state_dicts for rank 2
-successfully loaded 8 ZeRO state_dicts for rank 25
-successfully loaded 8 ZeRO state_dicts for rank 230
-successfully loaded 8 ZeRO state_dicts for rank 235
-successfully loaded 8 ZeRO state_dicts for rank 31
-successfully loaded 8 ZeRO state_dicts for rank 177
-successfully loaded 8 ZeRO state_dicts for rank 28
-successfully loaded 8 ZeRO state_dicts for rank 238
-loading 8 zero partition checkpoints for rank 60
-loading 8 zero partition checkpoints for rank 144
-loading 8 zero partition checkpoints for rank 104
-loading 8 zero partition checkpoints for rank 213
-loading 8 zero partition checkpoints for rank 89
-loading 8 zero partition checkpoints for rank 40
-successfully loaded 8 ZeRO state_dicts for rank 239
-loading 8 zero partition checkpoints for rank 140
-successfully loaded 8 ZeRO state_dicts for rank 191
-loading 8 zero partition checkpoints for rank 91
-loading 8 zero partition checkpoints for rank 100
-successfully loaded 8 ZeRO state_dicts for rank 173
-successfully loaded 8 ZeRO state_dicts for rank 232
-successfully loaded 8 ZeRO state_dicts for rank 22
-loading 8 zero partition checkpoints for rank 52
-successfully loaded 8 ZeRO state_dicts for rank 188
-successfully loaded 8 ZeRO state_dicts for rank 249
-successfully loaded 8 ZeRO state_dicts for rank 189
-successfully loaded 8 ZeRO state_dicts for rank 237
-successfully loaded 8 ZeRO state_dicts for rank 253
-successfully loaded 8 ZeRO state_dicts for rank 229
-successfully loaded 8 ZeRO state_dicts for rank 29
-successfully loaded 8 ZeRO state_dicts for rank 226
-successfully loaded 8 ZeRO state_dicts for rank 251
-loading 8 zero partition checkpoints for rank 212
-successfully loaded 8 ZeRO state_dicts for rank 17
-successfully loaded 8 ZeRO state_dicts for rank 241
-loading 8 zero partition checkpoints for rank 214
-successfully loaded 8 ZeRO state_dicts for rank 9
-successfully loaded 8 ZeRO state_dicts for rank 255
-successfully loaded 8 ZeRO state_dicts for rank 15
-successfully loaded 8 ZeRO state_dicts for rank 245
-loading 8 zero partition checkpoints for rank 211
-successfully loaded 8 ZeRO state_dicts for rank 246
-loading 8 zero partition checkpoints for rank 87
-successfully loaded 8 ZeRO state_dicts for rank 242
-successfully loaded 8 ZeRO state_dicts for rank 227
-successfully loaded 8 ZeRO state_dicts for rank 243
-successfully loaded 8 ZeRO state_dicts for rank 247
-loading 8 zero partition checkpoints for rank 143
-successfully loaded 8 ZeRO state_dicts for rank 19
-loading 8 zero partition checkpoints for rank 116
-loading 8 zero partition checkpoints for rank 132
-loading 8 zero partition checkpoints for rank 88
-successfully loaded 8 ZeRO state_dicts for rank 20
-loading 8 zero partition checkpoints for rank 49
-loading 8 zero partition checkpoints for rank 128
-loading 8 zero partition checkpoints for rank 154
-loading 8 zero partition checkpoints for rank 165
-loading 8 zero partition checkpoints for rank 62
-successfully loaded 8 ZeRO state_dicts for rank 254
-loading 8 zero partition checkpoints for rank 93
-successfully loaded 8 ZeRO state_dicts for rank 225
-loading 8 zero partition checkpoints for rank 81
-loading 8 zero partition checkpoints for rank 127
-loading 8 zero partition checkpoints for rank 76
-loading 8 zero partition checkpoints for rank 99
-loading 8 zero partition checkpoints for rank 57
-successfully loaded 8 ZeRO state_dicts for rank 0
-loading 8 zero partition checkpoints for rank 90
-loading 8 zero partition checkpoints for rank 73
-successfully loaded 8 ZeRO state_dicts for rank 1
-successfully loaded 8 ZeRO state_dicts for rank 234
-loading 8 zero partition checkpoints for rank 166
-successfully loaded 8 ZeRO state_dicts for rank 3
-loading 8 zero partition checkpoints for rank 84
-loading 8 zero partition checkpoints for rank 113
-loading 8 zero partition checkpoints for rank 147
-loading 8 zero partition checkpoints for rank 219
-loading 8 zero partition checkpoints for rank 51
-loading 8 zero partition checkpoints for rank 72
-loading 8 zero partition checkpoints for rank 58
-loading 8 zero partition checkpoints for rank 160
-loading 8 zero partition checkpoints for rank 56
-loading 8 zero partition checkpoints for rank 158
-loading 8 zero partition checkpoints for rank 65
-loading 8 zero partition checkpoints for rank 130
-loading 8 zero partition checkpoints for rank 115
-loading 8 zero partition checkpoints for rank 67
-successfully loaded 8 ZeRO state_dicts for rank 21
-loading 8 zero partition checkpoints for rank 209
-loading 8 zero partition checkpoints for rank 109
-loading 8 zero partition checkpoints for rank 44
-loading 8 zero partition checkpoints for rank 74
-loading 8 zero partition checkpoints for rank 86
-loading 8 zero partition checkpoints for rank 45
-loading 8 zero partition checkpoints for rank 83
-loading 8 zero partition checkpoints for rank 171
-loading 8 zero partition checkpoints for rank 136
-successfully loaded 8 ZeRO state_dicts for rank 23
-loading 8 zero partition checkpoints for rank 218
-loading 8 zero partition checkpoints for rank 159
-loading 8 zero partition checkpoints for rank 196
-loading 8 zero partition checkpoints for rank 66
-loading 8 zero partition checkpoints for rank 125
-loading 8 zero partition checkpoints for rank 111
-loading 8 zero partition checkpoints for rank 181
-loading 8 zero partition checkpoints for rank 151
-loading 8 zero partition checkpoints for rank 64
-loading 8 zero partition checkpoints for rank 134
-loading 8 zero partition checkpoints for rank 85
-loading 8 zero partition checkpoints for rank 206
-loading 8 zero partition checkpoints for rank 120
-loading 8 zero partition checkpoints for rank 37
-loading 8 zero partition checkpoints for rank 146
-loading 8 zero partition checkpoints for rank 95
-loading 8 zero partition checkpoints for rank 194
-loading 8 zero partition checkpoints for rank 202
-loading 8 zero partition checkpoints for rank 178
-loading 8 zero partition checkpoints for rank 138
-loading 8 zero partition checkpoints for rank 170
-loading 8 zero partition checkpoints for rank 55
-loading 8 zero partition checkpoints for rank 61
-loading 8 zero partition checkpoints for rank 101
-loading 8 zero partition checkpoints for rank 124
-loading 8 zero partition checkpoints for rank 135
-loading 8 zero partition checkpoints for rank 148
-loading 8 zero partition checkpoints for rank 139
-loading 8 zero partition checkpoints for rank 14
-loading 8 zero partition checkpoints for rank 77
-loading 8 zero partition checkpoints for rank 39
-loading 8 zero partition checkpoints for rank 152
-loading 8 zero partition checkpoints for rank 59
-loading 8 zero partition checkpoints for rank 80
-loading 8 zero partition checkpoints for rank 106
-loading 8 zero partition checkpoints for rank 69
-loading 8 zero partition checkpoints for rank 79
-loading 8 zero partition checkpoints for rank 47
-loading 8 zero partition checkpoints for rank 203
-loading 8 zero partition checkpoints for rank 94
-loading 8 zero partition checkpoints for rank 186
-loading 8 zero partition checkpoints for rank 217
-loading 8 zero partition checkpoints for rank 97
-loading 8 zero partition checkpoints for rank 92
-loading 8 zero partition checkpoints for rank 71
-loading 8 zero partition checkpoints for rank 164
-loading 8 zero partition checkpoints for rank 41
-loading 8 zero partition checkpoints for rank 103
-loading 8 zero partition checkpoints for rank 131
-loading 8 zero partition checkpoints for rank 197
-loading 8 zero partition checkpoints for rank 112
-loading 8 zero partition checkpoints for rank 145
-loading 8 zero partition checkpoints for rank 180
-loading 8 zero partition checkpoints for rank 70
-loading 8 zero partition checkpoints for rank 63
-loading 8 zero partition checkpoints for rank 123
-loading 8 zero partition checkpoints for rank 137
-loading 8 zero partition checkpoints for rank 82
-loading 8 zero partition checkpoints for rank 150
-loading 8 zero partition checkpoints for rank 68
-loading 8 zero partition checkpoints for rank 228
-loading 8 zero partition checkpoints for rank 187
-loading 8 zero partition checkpoints for rank 205
-loading 8 zero partition checkpoints for rank 8
-loading 8 zero partition checkpoints for rank 46
-loading 8 zero partition checkpoints for rank 117
-loading 8 zero partition checkpoints for rank 185
-loading 8 zero partition checkpoints for rank 183
-loading 8 zero partition checkpoints for rank 168
-loading 8 zero partition checkpoints for rank 133
-loading 8 zero partition checkpoints for rank 155
-loading 8 zero partition checkpoints for rank 176
-loading 8 zero partition checkpoints for rank 119
-loading 8 zero partition checkpoints for rank 153
-loading 8 zero partition checkpoints for rank 121
-loading 8 zero partition checkpoints for rank 42
-loading 8 zero partition checkpoints for rank 102
-loading 8 zero partition checkpoints for rank 96
-loading 8 zero partition checkpoints for rank 236
-loading 8 zero partition checkpoints for rank 201
-loading 8 zero partition checkpoints for rank 179
-loading 8 zero partition checkpoints for rank 162
-loading 8 zero partition checkpoints for rank 182
-loading 8 zero partition checkpoints for rank 43
-loading 8 zero partition checkpoints for rank 107
-loading 8 zero partition checkpoints for rank 129
-loading 8 zero partition checkpoints for rank 110
-loading 8 zero partition checkpoints for rank 38
-loading 8 zero partition checkpoints for rank 126
-loading 8 zero partition checkpoints for rank 105
-loading 8 zero partition checkpoints for rank 193
-loading 8 zero partition checkpoints for rank 118
-loading 8 zero partition checkpoints for rank 248
-loading 8 zero partition checkpoints for rank 114
-loading 8 zero partition checkpoints for rank 122
-loading 8 zero partition checkpoints for rank 200
-loading 8 zero partition checkpoints for rank 33
-loading 8 zero partition checkpoints for rank 177
-loading 8 zero partition checkpoints for rank 149
-loading 8 zero partition checkpoints for rank 36
-loading 8 zero partition checkpoints for rank 233
-loading 8 zero partition checkpoints for rank 53
-loading 8 zero partition checkpoints for rank 161
-loading 8 zero partition checkpoints for rank 12
-loading 8 zero partition checkpoints for rank 244
-loading 8 zero partition checkpoints for rank 78
-loading 8 zero partition checkpoints for rank 30
-loading 8 zero partition checkpoints for rank 98
-loading 8 zero partition checkpoints for rank 204
-loading 8 zero partition checkpoints for rank 16
-loading 8 zero partition checkpoints for rank 169
-loading 8 zero partition checkpoints for rank 28
-loading 8 zero partition checkpoints for rank 199
-loading 8 zero partition checkpoints for rank 230
-loading 8 zero partition checkpoints for rank 224
-loading 8 zero partition checkpoints for rank 35
-loading 8 zero partition checkpoints for rank 240
-loading 8 zero partition checkpoints for rank 167
-loading 8 zero partition checkpoints for rank 54
-loading 8 zero partition checkpoints for rank 210
-loading 8 zero partition checkpoints for rank 27
-loading 8 zero partition checkpoints for rank 10
-loading 8 zero partition checkpoints for rank 190
-loading 8 zero partition checkpoints for rank 192
-loading 8 zero partition checkpoints for rank 34
-loading 8 zero partition checkpoints for rank 252
-loading 8 zero partition checkpoints for rank 163
-loading 8 zero partition checkpoints for rank 13
-loading 8 zero partition checkpoints for rank 207
-loading 8 zero partition checkpoints for rank 191
-loading 8 zero partition checkpoints for rank 32
-loading 8 zero partition checkpoints for rank 231
-loading 8 zero partition checkpoints for rank 26
-loading 8 zero partition checkpoints for rank 9
-loading 8 zero partition checkpoints for rank 255
-loading 8 zero partition checkpoints for rank 11
-loading 8 zero partition checkpoints for rank 175
-loading 8 zero partition checkpoints for rank 241
-loading 8 zero partition checkpoints for rank 25
-loading 8 zero partition checkpoints for rank 189
-loading 8 zero partition checkpoints for rank 17
-loading 8 zero partition checkpoints for rank 24
-loading 8 zero partition checkpoints for rank 245
-loading 8 zero partition checkpoints for rank 208
-loading 8 zero partition checkpoints for rank 198
-loading 8 zero partition checkpoints for rank 254
-loading 8 zero partition checkpoints for rank 237
-loading 8 zero partition checkpoints for rank 188
-loading 8 zero partition checkpoints for rank 251
-loading 8 zero partition checkpoints for rank 225
-loading 8 zero partition checkpoints for rank 0
- checkpoint version 3.0
-loading 8 zero partition checkpoints for rank 253
-loading 8 zero partition checkpoints for rank 229
-loading 8 zero partition checkpoints for rank 250
-loading 8 zero partition checkpoints for rank 195
-loading 8 zero partition checkpoints for rank 173
-loading 8 zero partition checkpoints for rank 1
-loading 8 zero partition checkpoints for rank 234
-loading 8 zero partition checkpoints for rank 15
-loading 8 zero partition checkpoints for rank 239
-loading 8 zero partition checkpoints for rank 247
-loading 8 zero partition checkpoints for rank 246
-loading 8 zero partition checkpoints for rank 172
-loading 8 zero partition checkpoints for rank 249
-loading 8 zero partition checkpoints for rank 238
-loading 8 zero partition checkpoints for rank 31
-loading 8 zero partition checkpoints for rank 243
-loading 8 zero partition checkpoints for rank 242
-loading 8 zero partition checkpoints for rank 174
-loading 8 zero partition checkpoints for rank 226
-loading 8 zero partition checkpoints for rank 29
-loading 8 zero partition checkpoints for rank 18
-loading 8 zero partition checkpoints for rank 227
-loading 8 zero partition checkpoints for rank 19
-loading 8 zero partition checkpoints for rank 2
-loading 8 zero partition checkpoints for rank 235
-loading 8 zero partition checkpoints for rank 232
-loading 8 zero partition checkpoints for rank 3
-loading 8 zero partition checkpoints for rank 22
-loading 8 zero partition checkpoints for rank 20
-loading 8 zero partition checkpoints for rank 21
-loading 8 zero partition checkpoints for rank 23
-successfully loaded 8 ZeRO state_dicts for rank 6
-loading 8 zero partition checkpoints for rank 6
-successfully loaded 8 ZeRO state_dicts for rank 7
-successfully loaded 8 ZeRO state_dicts for rank 4
-successfully loaded 8 ZeRO state_dicts for rank 5
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-25 04:30:09 CEST)" was missed by 0:00:03.764782
-loading 8 zero partition checkpoints for rank 4
-loading 8 zero partition checkpoints for rank 7
-loading 8 zero partition checkpoints for rank 5
- successfully loaded checkpoint from /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints at iteration 6210
-time (ms) | load-checkpoint: 91691.46
-[after model, optimizer, and learning rate scheduler are built] datetime: 2021-09-25 04:29:17
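The five "> using checkpoint value" lines near the top of this section restore the learning-rate scheduler state. A minimal sketch of a matching linear-warmup plus cosine-decay schedule, using only the logged values; the step unit is an assumption, though the numbers behave like samples rather than optimizer steps, since the logged learning rate first reaches its 6.000E-05 peak at iteration 6510, just as consumed samples cross 216320:

```python
import math

MAX_LR, MIN_LR = 6e-05, 6e-06          # "using checkpoint value" lines above
WARMUP, TOTAL = 216_320, 126_953_125   # warmup / total budget (unit assumed: samples)

def cosine_lr(step):
    """Linear warmup to MAX_LR, then cosine decay down to MIN_LR (a sketch)."""
    if step < WARMUP:
        return MAX_LR * step / WARMUP
    if step >= TOTAL:
        return MIN_LR
    progress = (step - WARMUP) / (TOTAL - WARMUP)
    return MIN_LR + 0.5 * (MAX_LR - MIN_LR) * (1.0 + math.cos(math.pi * progress))

print(cosine_lr(194_400))   # ~5.39e-05, near the 5.378E-05 logged at iteration 6220
print(cosine_lr(217_600))   # ~6.0e-05, matching the 6.000E-05 logged from 6510 on
```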
-> building train, validation, and test datasets ...
- > datasets target sizes (minimum size):
- train: 300000000
- validation: 1638400
- test: 10240
-> building train, validation, and test datasets for GPT ...
- > building dataset index ...
- reading sizes...
- reading pointers...
- reading document index...
- creating numpy buffer of mmap...
- creating memory view of numpy buffer...
- > finished creating indexed dataset in 0.138486 seconds
- number of documents: 304230423
- > dataset split:
- train:
- document indices in [0, 288714672) total of 288714672 documents
- validation:
- document indices in [288714672, 303926193) total of 15211521 documents
- test:
- document indices in [303926193, 304230423) total of 304230 documents
- > loading doc-idx mapping from /gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document_train_indexmap_300000000ns_2048sl_42s_doc_idx.npy
- > loading sample-idx mapping from /gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document_train_indexmap_300000000ns_2048sl_42s_sample_idx.npy
- > loading shuffle-idx mapping from /gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document_train_indexmap_300000000ns_2048sl_42s_shuffle_idx.npy
- loaded indexed file in 0.350 seconds
- total number of samples: 394611670
- total number of epochs: 3
- > loading doc-idx mapping from /gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document_valid_indexmap_1638400ns_2048sl_42s_doc_idx.npy
- > loading sample-idx mapping from /gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document_valid_indexmap_1638400ns_2048sl_42s_sample_idx.npy
- > loading shuffle-idx mapping from /gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document_valid_indexmap_1638400ns_2048sl_42s_shuffle_idx.npy
- loaded indexed file in 0.276 seconds
- total number of samples: 6927161
- total number of epochs: 1
- > loading doc-idx mapping from /gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document_test_indexmap_10240ns_2048sl_42s_doc_idx.npy
- > loading sample-idx mapping from /gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document_test_indexmap_10240ns_2048sl_42s_sample_idx.npy
- > loading shuffle-idx mapping from /gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document_test_indexmap_10240ns_2048sl_42s_shuffle_idx.npy
- loaded indexed file in 0.080 seconds
- total number of samples: 137384
- total number of epochs: 1
-> finished creating GPT datasets ...
-[after dataloaders are built] datetime: 2021-09-25 04:29:23
-done with setup ...
-training ...
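The split boundaries above are a ratio-based three-way split of the 304,230,423 documents. A quick arithmetic check using only numbers from the log; the 949/50/1 ratio is an inference from the resulting fractions, not a value read from the run's configuration:

```python
total = 304_230_423                               # number of documents
train_end, valid_end = 288_714_672, 303_926_193   # logged split boundaries

for name, count in [("train", train_end),
                    ("valid", valid_end - train_end),
                    ("test", total - valid_end)]:
    print(name, count, round(count / total, 6))
# train 288714672 0.949 / valid 15211521 0.05 / test 304230 0.001
```

The index-map filenames also encode the request that produced them: e.g. `..._train_indexmap_300000000ns_2048sl_42s_doc_idx.npy` reads as 300,000,000 samples, sequence length 2048, shuffle seed 42, matching the target sizes listed above.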
-time (ms) | model-and-optimizer-setup: 99723.96 | train/valid/test-data-iterators-setup: 5641.98
-[before the start of training step] datetime: 2021-09-25 04:29:23
-[2021-09-25 04:29:23,929] [INFO] [checkpointing.py:408:forward] Activation Checkpointing Information
-[2021-09-25 04:29:23,930] [INFO] [checkpointing.py:409:forward] ----Partition Activations False, CPU CHECKPOINTING False
-[2021-09-25 04:29:23,930] [INFO] [checkpointing.py:412:forward] ----contiguous Memory Checkpointing False with 32 total layers
-[2021-09-25 04:29:23,930] [INFO] [checkpointing.py:415:forward] ----Synchronization False
-[2021-09-25 04:29:23,930] [INFO] [checkpointing.py:416:forward] ----Profiling time in checkpointing False
-[Rank 1] (after 6220 iterations) memory (MB) | allocated: 6689.83056640625 | max allocated: 13899.01416015625 | reserved: 23406.0 | max reserved: 23406.0
-[Rank 225] (after 6220 iterations) memory (MB) | allocated: 7107.7119140625 | max allocated: 11885.68994140625 | reserved: 21700.0 | max reserved: 21700.0
-[Rank 226] (after 6220 iterations) memory (MB) | allocated: 7107.7119140625 | max allocated: 11885.6884765625 | reserved: 22492.0 | max reserved: 22492.0
-[Rank 2] (after 6220 iterations) memory (MB) | allocated: 6689.83056640625 | max allocated: 13899.01416015625 | reserved: 23406.0 | max reserved: 23406.0
-[Rank 0] (after 6220 iterations) memory (MB) | allocated: 6689.83056640625 | max allocated: 13899.01416015625 | reserved: 23726.0 | max reserved: 23726.0
-[Rank 224] (after 6220 iterations) memory (MB) | allocated: 7107.7119140625 | max allocated: 11885.68896484375 | reserved: 22492.0 | max reserved: 22492.0
-[Rank 3] (after 6220 iterations) memory (MB) | allocated: 6689.83056640625 | max allocated: 13899.01416015625 | reserved: 23374.0 | max reserved: 23374.0
-[Rank 227] (after 6220 iterations) memory (MB) | allocated: 7107.7119140625 | max allocated: 11885.68994140625 | reserved: 22492.0 | max reserved: 22492.0
- iteration 6220/ 159576 | consumed samples: 194400 | elapsed time per iteration (ms): 18925.1 | learning rate: 5.378E-05 | global batch size: 80 | lm loss: 6.332304E+00 | loss scale: 4096.0 | grad norm: 207900.224 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-[Rank 33] (after 6220 iterations) memory (MB) | allocated: 5861.55029296875 | max allocated: 12082.4677734375 | reserved: 20130.0 | max reserved: 20130.0
-[Rank 97] (after 6220 iterations) memory (MB) | allocated: 5861.55029296875 | max allocated: 11538.466796875 | reserved: 19402.0 | max reserved: 19402.0
-[Rank 161] (after 6220 iterations) memory (MB) | allocated: 5861.55029296875 | max allocated: 10994.4658203125 | reserved: 18826.0 | max reserved: 18826.0
-[Rank 193] (after 6220 iterations) memory (MB) | allocated: 5861.55029296875 | max allocated: 10722.46533203125 | reserved: 18826.0 | max reserved: 18826.0
-[Rank 129] (after 6220 iterations) memory (MB) | allocated: 5861.55029296875 | max allocated: 11266.46630859375 | reserved: 19662.0 | max reserved: 19662.0
-[Rank 65] (after 6220 iterations) memory (MB) | allocated: 5861.55029296875 | max allocated: 11810.46728515625 | reserved: 19946.0 | max reserved: 19946.0
-[Rank 34] (after 6220 iterations) memory (MB) | allocated: 5861.55029296875 | max allocated: 12082.4677734375 | reserved: 20170.0 | max reserved: 20170.0
-[Rank 162] (after 6220 iterations) memory (MB) | allocated: 5861.55029296875 | max allocated: 10994.4658203125 | reserved: 18826.0 | max reserved: 18826.0
-[Rank 130] (after 6220 iterations) memory (MB) | allocated: 5861.55029296875 | max allocated: 11266.46630859375 | reserved: 19390.0 | max reserved: 19390.0
-[Rank 98] (after 6220 iterations) memory (MB) | allocated: 5861.55029296875 | max allocated: 11538.466796875 | reserved: 19722.0 | max reserved: 19722.0
-[Rank 194] (after 6220 iterations) memory (MB) | allocated: 5861.55029296875 | max allocated: 10722.46533203125 | reserved: 18826.0 | max reserved: 18826.0
-[Rank 66] (after 6220 iterations) memory (MB) | allocated: 5861.55029296875 | max allocated: 11810.46728515625 | reserved: 20094.0 | max reserved: 20094.0
-[Rank 32] (after 6220 iterations) memory (MB) | allocated: 5861.55029296875 | max allocated: 12082.4677734375 | reserved: 20456.0 | max reserved: 20456.0
-[Rank 128] (after 6220 iterations) memory (MB) | allocated: 5861.55029296875 | max allocated: 11266.46630859375 | reserved: 19908.0 | max reserved: 19908.0
-[Rank 96] (after 6220 iterations) memory (MB) | allocated: 5861.55029296875 | max allocated: 11538.466796875 | reserved: 19828.0 | max reserved: 19828.0
-[Rank 64] (after 6220 iterations) memory (MB) | allocated: 5861.55029296875 | max allocated: 11810.46728515625 | reserved: 20328.0 | max reserved: 20328.0
-[Rank 192] (after 6220 iterations) memory (MB) | allocated: 5861.55029296875 | max allocated: 10722.46533203125 | reserved: 19396.0 | max reserved: 19396.0
-[Rank 160] (after 6220 iterations) memory (MB) | allocated: 5861.55029296875 | max allocated: 10994.4658203125 | reserved: 19572.0 | max reserved: 19572.0
-[Rank 99] (after 6220 iterations) memory (MB) | allocated: 5861.55029296875 | max allocated: 11538.466796875 | reserved: 19662.0 | max reserved: 19662.0
-[Rank 67] (after 6220 iterations) memory (MB) | allocated: 5861.55029296875 | max allocated: 11810.46728515625 | reserved: 19966.0 | max reserved: 19966.0
-[Rank 131] (after 6220 iterations) memory (MB) | allocated: 5861.55029296875 | max allocated: 11266.46630859375 | reserved: 19578.0 | max reserved: 19578.0
-[Rank 35] (after 6220 iterations) memory (MB) | allocated: 5861.55029296875 | max allocated: 12082.4677734375 | reserved: 20078.0 | max reserved: 20078.0
-[Rank 195] (after 6220 iterations) memory (MB) | allocated: 5861.55029296875 | max allocated: 10722.46533203125 | reserved: 18842.0 | max reserved: 18842.0
-[Rank 163] (after 6220 iterations) memory (MB) | allocated: 5861.55029296875 | max allocated: 10994.4658203125 | reserved: 19066.0 | max reserved: 19066.0
- iteration 6230/ 159576 | consumed samples: 195200 | elapsed time per iteration (ms): 17419.3 | learning rate: 5.400E-05 | global batch size: 80 | lm loss: 6.312761E+00 | loss scale: 4096.0 | grad norm: 102010.658 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6240/ 159576 | consumed samples: 196000 | elapsed time per iteration (ms): 17458.3 | learning rate: 5.423E-05 | global batch size: 80 | lm loss: 6.325917E+00 | loss scale: 4096.0 | grad norm: 139671.438 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6250/ 159576 | consumed samples: 196800 | elapsed time per iteration (ms): 17438.0 | learning rate: 5.445E-05 | global batch size: 80 | lm loss: 6.330989E+00 | loss scale: 4096.0 | grad norm: 117429.787 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6260/ 159576 | consumed samples: 197600 | elapsed time per iteration (ms): 17495.4 | learning rate: 5.467E-05 | global batch size: 80 | lm loss: 6.330341E+00 | loss scale: 4096.0 | grad norm: 101380.992 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
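The per-rank "memory (MB)" lines report the standard torch.cuda allocator counters divided by 2**20: "allocated" is memory held by live tensors, "reserved" is what the caching allocator has claimed from the device. A sketch of a helper that would emit lines in this format; the function name is hypothetical, not the run's own reporting code:

```python
import torch

def report_memory(rank: int, iteration: int) -> None:
    # Produces a line shaped like "[Rank 1] (after 6220 iterations) memory (MB) | ..."
    mb = 1 << 20
    print(f"[Rank {rank}] (after {iteration} iterations) memory (MB)"
          f" | allocated: {torch.cuda.memory_allocated() / mb}"
          f" | max allocated: {torch.cuda.max_memory_allocated() / mb}"
          f" | reserved: {torch.cuda.memory_reserved() / mb}"
          f" | max reserved: {torch.cuda.max_memory_reserved() / mb}")
```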
- iteration 6270/ 159576 | consumed samples: 198400 | elapsed time per iteration (ms): 17488.9 | learning rate: 5.489E-05 | global batch size: 80 | lm loss: 6.304220E+00 | loss scale: 4096.0 | grad norm: 137994.450 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6280/ 159576 | consumed samples: 199200 | elapsed time per iteration (ms): 17456.9 | learning rate: 5.511E-05 | global batch size: 80 | lm loss: 6.302861E+00 | loss scale: 4096.0 | grad norm: 117645.788 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6290/ 159576 | consumed samples: 200000 | elapsed time per iteration (ms): 16818.4 | learning rate: 5.531E-05 | global batch size: 80 | lm loss: 6.313686E+00 | loss scale: 4096.0 | grad norm: 87880.797 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6300/ 159576 | consumed samples: 200800 | elapsed time per iteration (ms): 17519.8 | learning rate: 5.554E-05 | global batch size: 80 | lm loss: 6.270583E+00 | loss scale: 4096.0 | grad norm: 86063.377 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6310/ 159576 | consumed samples: 201600 | elapsed time per iteration (ms): 17461.4 | learning rate: 5.576E-05 | global batch size: 80 | lm loss: 6.315401E+00 | loss scale: 4096.0 | grad norm: 120394.115 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6320/ 159576 | consumed samples: 202400 | elapsed time per iteration (ms): 17455.8 | learning rate: 5.598E-05 | global batch size: 80 | lm loss: 6.326277E+00 | loss scale: 4096.0 | grad norm: 95784.457 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6330/ 159576 | consumed samples: 203200 | elapsed time per iteration (ms): 17431.8 | learning rate: 5.620E-05 | global batch size: 80 | lm loss: 6.333566E+00 | loss scale: 4096.0 | grad norm: 119951.862 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6340/ 159576 | consumed samples: 204000 | elapsed time per iteration (ms): 16668.3 | learning rate: 5.640E-05 | global batch size: 80 | lm loss: 6.321040E+00 | loss scale: 2048.0 | grad norm: 54351.143 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-[2021-09-25 05:08:29] PULSE: tr8-104B is waiting for the previous Job Array job to finish before scheduling a new one (1185639_[2-10%1] on 'gpu_p13' partition)
-[2021-09-25 05:08:29] PULSE: tr8-104B is running for 41:28 since 2021-09-25T04:27:01 (1185639_1 on 'gpu_p13' partition (r6i5n[7-8],r6i6n0,r7i2n[4-5],r7i3n2,r7i6n[2-4],r7i7n[7-8],r8i0n[0,2-3,5-8],r8i1n[0,2-4],r8i2n8,r8i3n[0-2],r8i5n[3-4],r8i7n[3-8],r9i0n[0-5],r9i1n[0-3],r9i2n[3-6,8],r9i3n[0-1,7-8],r9i4n[0-3],r9i5n[3-8],r9i6n0)
- iteration 6350/ 159576 | consumed samples: 204800 | elapsed time per iteration (ms): 17330.6 | learning rate: 5.662E-05 | global batch size: 80 | lm loss: 6.297153E+00 | loss scale: 2048.0 | grad norm: 61555.753 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6360/ 159576 | consumed samples: 205600 | elapsed time per iteration (ms): 17390.9 | learning rate: 5.684E-05 | global batch size: 80 | lm loss: 6.296333E+00 | loss scale: 2048.0 | grad norm: 67211.747 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
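Each iteration line advances "consumed samples" by the global batch size times the number of steps, so the deltas are easy to sanity-check, and samples-per-second throughput falls out of the elapsed time per iteration. Pure arithmetic on logged values:

```python
# iterations 6220 -> 6230: 10 steps at global batch size 80
assert 195_200 - 194_400 == 10 * 80

# throughput in the 6230 window: 80 samples every 17419.3 ms
print(80 / (17419.3 / 1000))   # ~4.59 samples/s, ~9.4k tokens/s at 2048 seq length
```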
- iteration 6370/ 159576 | consumed samples: 206400 | elapsed time per iteration (ms): 17338.2 | learning rate: 5.707E-05 | global batch size: 80 | lm loss: 6.309451E+00 | loss scale: 2048.0 | grad norm: 66671.395 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6380/ 159576 | consumed samples: 207200 | elapsed time per iteration (ms): 17380.7 | learning rate: 5.729E-05 | global batch size: 80 | lm loss: 6.301356E+00 | loss scale: 2048.0 | grad norm: 45299.990 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6390/ 159576 | consumed samples: 208000 | elapsed time per iteration (ms): 17366.7 | learning rate: 5.751E-05 | global batch size: 80 | lm loss: 6.335297E+00 | loss scale: 2048.0 | grad norm: 59836.646 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6400/ 159576 | consumed samples: 208800 | elapsed time per iteration (ms): 17383.7 | learning rate: 5.773E-05 | global batch size: 80 | lm loss: 6.303946E+00 | loss scale: 2048.0 | grad norm: 55594.564 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6410/ 159576 | consumed samples: 209600 | elapsed time per iteration (ms): 17402.0 | learning rate: 5.795E-05 | global batch size: 80 | lm loss: 6.335719E+00 | loss scale: 2048.0 | grad norm: 63504.303 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6420/ 159576 | consumed samples: 210400 | elapsed time per iteration (ms): 17371.7 | learning rate: 5.818E-05 | global batch size: 80 | lm loss: 6.278386E+00 | loss scale: 2048.0 | grad norm: 252963.122 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6430/ 159576 | consumed samples: 211200 | elapsed time per iteration (ms): 17394.4 | learning rate: 5.840E-05 | global batch size: 80 | lm loss: 6.309026E+00 | loss scale: 2048.0 | grad norm: 70987.021 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6440/ 159576 | consumed samples: 212000 | elapsed time per iteration (ms): 17385.8 | learning rate: 5.862E-05 | global batch size: 80 | lm loss: 6.352011E+00 | loss scale: 2048.0 | grad norm: 57730.647 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6450/ 159576 | consumed samples: 212800 | elapsed time per iteration (ms): 17363.4 | learning rate: 5.884E-05 | global batch size: 80 | lm loss: 6.338916E+00 | loss scale: 2048.0 | grad norm: 74089.414 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6460/ 159576 | consumed samples: 213600 | elapsed time per iteration (ms): 17402.1 | learning rate: 5.906E-05 | global batch size: 80 | lm loss: 6.307239E+00 | loss scale: 2048.0 | grad norm: 43748.712 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6470/ 159576 | consumed samples: 214400 | elapsed time per iteration (ms): 17495.0 | learning rate: 5.929E-05 | global batch size: 80 | lm loss: 6.336151E+00 | loss scale: 2048.0 | grad norm: 39508.293 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
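The "loss scale" column halved from 4096.0 to 2048.0 at iteration 6340 and later doubles back; this is the usual dynamic loss-scaling scheme for fp16 training. A generic sketch of the update rule; the 1000-step growth window is an assumption, not this run's configured value:

```python
def next_loss_scale(scale, overflow, good_steps, window=1000, min_scale=1.0):
    """Generic dynamic fp16 loss scaling: halve on inf/nan grads,
    double again after a clean window of steps (a sketch)."""
    if overflow:
        return max(scale / 2.0, min_scale), 0   # e.g. 4096.0 -> 2048.0 at iteration 6340
    if good_steps + 1 >= window:
        return scale * 2.0, 0                   # e.g. 2048.0 -> 4096.0 around iteration 6840
    return scale, good_steps + 1
```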
- iteration 6480/ 159576 | consumed samples: 215200 | elapsed time per iteration (ms): 17462.6 | learning rate: 5.951E-05 | global batch size: 80 | lm loss: 6.356039E+00 | loss scale: 2048.0 | grad norm: 37602.564 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6490/ 159576 | consumed samples: 216000 | elapsed time per iteration (ms): 17419.0 | learning rate: 5.973E-05 | global batch size: 80 | lm loss: 6.355389E+00 | loss scale: 2048.0 | grad norm: 44833.008 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6500/ 159576 | consumed samples: 216800 | elapsed time per iteration (ms): 17489.2 | learning rate: 5.995E-05 | global batch size: 80 | lm loss: 6.336482E+00 | loss scale: 2048.0 | grad norm: 54162.793 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6510/ 159576 | consumed samples: 217600 | elapsed time per iteration (ms): 17458.7 | learning rate: 6.000E-05 | global batch size: 80 | lm loss: 6.337574E+00 | loss scale: 2048.0 | grad norm: 54595.463 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6520/ 159576 | consumed samples: 218400 | elapsed time per iteration (ms): 17515.2 | learning rate: 6.000E-05 | global batch size: 80 | lm loss: 6.356417E+00 | loss scale: 2048.0 | grad norm: 49879.304 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6530/ 159576 | consumed samples: 219200 | elapsed time per iteration (ms): 17447.6 | learning rate: 6.000E-05 | global batch size: 80 | lm loss: 6.369381E+00 | loss scale: 2048.0 | grad norm: 60963.731 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6540/ 159576 | consumed samples: 220000 | elapsed time per iteration (ms): 17448.8 | learning rate: 6.000E-05 | global batch size: 80 | lm loss: 6.338880E+00 | loss scale: 2048.0 | grad norm: 59382.431 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6550/ 159576 | consumed samples: 220800 | elapsed time per iteration (ms): 17544.1 | learning rate: 6.000E-05 | global batch size: 80 | lm loss: 6.331310E+00 | loss scale: 2048.0 | grad norm: 62265.638 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-[2021-09-25 06:08:34] PULSE: tr8-104B is waiting for the previous Job Array job to finish before scheduling a new one (1185639_[2-10%1] on 'gpu_p13' partition)
-[2021-09-25 06:08:34] PULSE: tr8-104B is running for 1:41:33 since 2021-09-25T04:27:01 (1185639_1 on 'gpu_p13' partition (r6i5n[7-8],r6i6n0,r7i2n[4-5],r7i3n2,r7i6n[2-4],r7i7n[7-8],r8i0n[0,2-3,5-8],r8i1n[0,2-4],r8i2n8,r8i3n[0-2],r8i5n[3-4],r8i7n[3-8],r9i0n[0-5],r9i1n[0-3],r9i2n[3-6,8],r9i3n[0-1,7-8],r9i4n[0-3],r9i5n[3-8],r9i6n0)
- iteration 6560/ 159576 | consumed samples: 221600 | elapsed time per iteration (ms): 17470.3 | learning rate: 6.000E-05 | global batch size: 80 | lm loss: 6.312242E+00 | loss scale: 2048.0 | grad norm: 58830.808 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6570/ 159576 | consumed samples: 222400 | elapsed time per iteration (ms): 17497.8 | learning rate: 6.000E-05 | global batch size: 80 | lm loss: 6.305868E+00 | loss scale: 2048.0 | grad norm: 95845.470 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6580/ 159576 | consumed samples: 223200 | elapsed time per iteration (ms): 17465.4 | learning rate: 6.000E-05 | global batch size: 80 | lm loss: 6.323441E+00 | loss scale: 2048.0 | grad norm: 67257.778 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6590/ 159576 | consumed samples: 224000 | elapsed time per iteration (ms): 17539.4 | learning rate: 6.000E-05 | global batch size: 80 | lm loss: 6.324122E+00 | loss scale: 2048.0 | grad norm: 68019.685 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6600/ 159576 | consumed samples: 224800 | elapsed time per iteration (ms): 17523.7 | learning rate: 6.000E-05 | global batch size: 80 | lm loss: 6.367977E+00 | loss scale: 2048.0 | grad norm: 72056.426 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6610/ 159576 | consumed samples: 225600 | elapsed time per iteration (ms): 17492.9 | learning rate: 6.000E-05 | global batch size: 80 | lm loss: 6.308113E+00 | loss scale: 2048.0 | grad norm: 149731.321 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6620/ 159576 | consumed samples: 226400 | elapsed time per iteration (ms): 17537.3 | learning rate: 6.000E-05 | global batch size: 80 | lm loss: 6.354418E+00 | loss scale: 2048.0 | grad norm: 62412.313 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6630/ 159576 | consumed samples: 227200 | elapsed time per iteration (ms): 17517.5 | learning rate: 6.000E-05 | global batch size: 80 | lm loss: 6.357222E+00 | loss scale: 2048.0 | grad norm: 85289.584 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6640/ 159576 | consumed samples: 228000 | elapsed time per iteration (ms): 17515.1 | learning rate: 6.000E-05 | global batch size: 80 | lm loss: 6.340989E+00 | loss scale: 2048.0 | grad norm: 56974.928 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6650/ 159576 | consumed samples: 228800 | elapsed time per iteration (ms): 17504.4 | learning rate: 6.000E-05 | global batch size: 80 | lm loss: 6.343948E+00 | loss scale: 2048.0 | grad norm: 94205.551 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6660/ 159576 | consumed samples: 229600 | elapsed time per iteration (ms): 17528.5 | learning rate: 6.000E-05 | global batch size: 80 | lm loss: 6.349052E+00 | loss scale: 2048.0 | grad norm: 59116.810 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6670/ 159576 | consumed samples: 230400 | elapsed time per iteration (ms): 17539.0 | learning rate: 6.000E-05 | global batch size: 80 | lm loss: 6.319823E+00 | loss scale: 2048.0 | grad norm: 89145.444 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6680/ 159576 | consumed samples: 231200 | elapsed time per iteration (ms): 17492.6 | learning rate: 6.000E-05 | global batch size: 80 | lm loss: 6.322467E+00 | loss scale: 2048.0 | grad norm: 79513.773 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6690/ 159576 | consumed samples: 232000 | elapsed time per iteration (ms): 17427.8 | learning rate: 6.000E-05 | global batch size: 80 | lm loss: 6.351400E+00 | loss scale: 2048.0 | grad norm: 80270.152 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6700/ 159576 | consumed samples: 232800 | elapsed time per iteration (ms): 17427.9 | learning rate: 6.000E-05 | global batch size: 80 | lm loss: 6.321815E+00 | loss scale: 2048.0 | grad norm: 89875.557 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6710/ 159576 | consumed samples: 233600 | elapsed time per iteration (ms): 17478.2 | learning rate: 6.000E-05 | global batch size: 80 | lm loss: 6.318744E+00 | loss scale: 2048.0 | grad norm: 75317.404 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-[2021-09-25 06:55:50] PULSE: tr8-104B is scheduled to start in 1 day, 10:16:13 (at 2021-09-26T17:12:04) (1188168 on 'gpu_p13' partition)
-[2021-09-25 06:55:50] PULSE: tr8-104B is waiting for the previous Job Array job to finish before scheduling a new one (1185639_[2-10%1] on 'gpu_p13' partition)
-[2021-09-25 06:55:50] PULSE: tr8-104B is running for 2:28:49 since 2021-09-25T04:27:01 (1185639_1 on 'gpu_p13' partition (r6i5n[7-8],r6i6n0,r7i2n[4-5],r7i3n2,r7i6n[2-4],r7i7n[7-8],r8i0n[0,2-3,5-8],r8i1n[0,2-4],r8i2n8,r8i3n[0-2],r8i5n[3-4],r8i7n[3-8],r9i0n[0-5],r9i1n[0-3],r9i2n[3-6,8],r9i3n[0-1,7-8],r9i4n[0-3],r9i5n[3-8],r9i6n0)
- iteration 6720/ 159576 | consumed samples: 234400 | elapsed time per iteration (ms): 17509.5 | learning rate: 6.000E-05 | global batch size: 80 | lm loss: 6.297193E+00 | loss scale: 2048.0 | grad norm: 136372.702 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6730/ 159576 | consumed samples: 235200 | elapsed time per iteration (ms): 17514.2 | learning rate: 6.000E-05 | global batch size: 80 | lm loss: 6.303332E+00 | loss scale: 2048.0 | grad norm: 84302.661 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6740/ 159576 | consumed samples: 236000 | elapsed time per iteration (ms): 17530.2 | learning rate: 6.000E-05 | global batch size: 80 | lm loss: 6.327809E+00 | loss scale: 2048.0 | grad norm: 84736.807 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6750/ 159576 | consumed samples: 236912 | elapsed time per iteration (ms): 18323.3 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.320579E+00 | loss scale: 2048.0 | grad norm: 68855.991 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-[2021-09-25 07:08:59] PULSE: tr8-104B is scheduled to start in 19:13:17 (at 2021-09-26T02:22:17) (1188168 on 'gpu_p13' partition)
-[2021-09-25 07:08:59] PULSE: tr8-104B is waiting for the previous Job Array job to finish before scheduling a new one (1185639_[2-10%1] on 'gpu_p13' partition)
-[2021-09-25 07:08:59] PULSE: tr8-104B is running for 2:41:58 since 2021-09-25T04:27:01 (1185639_1 on 'gpu_p13' partition (r6i5n[7-8],r6i6n0,r7i2n[4-5],r7i3n2,r7i6n[2-4],r7i7n[7-8],r8i0n[0,2-3,5-8],r8i1n[0,2-4],r8i2n8,r8i3n[0-2],r8i5n[3-4],r8i7n[3-8],r9i0n[0-5],r9i1n[0-3],r9i2n[3-6,8],r9i3n[0-1,7-8],r9i4n[0-3],r9i5n[3-8],r9i6n0)
- iteration 6760/ 159576 | consumed samples: 237872 | elapsed time per iteration (ms): 18776.3 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.303013E+00 | loss scale: 2048.0 | grad norm: 69740.116 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
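At iteration 6750 the global batch size steps from 80 to 96 (and later to 112 at 7240), i.e. a linear batch-size ramp-up in increments of 16. A sketch that reproduces the logged transition points; the start/increment/budget/target values (16, 16, 6,000,000 samples, 2048) are inferred from those points, not read from the run's arguments:

```python
def global_batch_size(consumed, start=16, inc=16,
                      ramp_samples=6_000_000, target=2048):
    # Hold each batch-size level for an equal share of the ramp-up budget.
    levels = (target - start) // inc        # 127 increments of 16
    per_level = ramp_samples / levels       # ~47,244 samples per level
    return start + min(int(consumed / per_level), levels) * inc

assert global_batch_size(236_000) == 80    # iteration 6740
assert global_batch_size(236_912) == 96    # iteration 6750
assert global_batch_size(284_032) == 112   # iteration 7240
```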
- iteration 6770/ 159576 | consumed samples: 238832 | elapsed time per iteration (ms): 18675.5 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.319376E+00 | loss scale: 2048.0 | grad norm: 83900.872 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6780/ 159576 | consumed samples: 239792 | elapsed time per iteration (ms): 18605.9 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.336406E+00 | loss scale: 2048.0 | grad norm: 62443.554 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6790/ 159576 | consumed samples: 240752 | elapsed time per iteration (ms): 18746.1 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.333478E+00 | loss scale: 2048.0 | grad norm: 73606.128 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6800/ 159576 | consumed samples: 241712 | elapsed time per iteration (ms): 18688.5 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.336754E+00 | loss scale: 2048.0 | grad norm: 96323.491 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6810/ 159576 | consumed samples: 242672 | elapsed time per iteration (ms): 18568.8 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.315503E+00 | loss scale: 2048.0 | grad norm: 65008.365 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6820/ 159576 | consumed samples: 243632 | elapsed time per iteration (ms): 18731.9 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.301308E+00 | loss scale: 2048.0 | grad norm: 70887.665 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6830/ 159576 | consumed samples: 244592 | elapsed time per iteration (ms): 18612.7 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.331754E+00 | loss scale: 2048.0 | grad norm: 78393.887 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6840/ 159576 | consumed samples: 245552 | elapsed time per iteration (ms): 18584.4 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.318947E+00 | loss scale: 4096.0 | grad norm: 175812.475 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6850/ 159576 | consumed samples: 246512 | elapsed time per iteration (ms): 18855.7 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.349559E+00 | loss scale: 4096.0 | grad norm: 150858.899 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6860/ 159576 | consumed samples: 247472 | elapsed time per iteration (ms): 18778.5 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.341676E+00 | loss scale: 4096.0 | grad norm: 374400.560 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6870/ 159576 | consumed samples: 248432 | elapsed time per iteration (ms): 18648.3 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.313033E+00 | loss scale: 4096.0 | grad norm: 153615.195 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6880/ 159576 | consumed samples: 249392 | elapsed time per iteration (ms): 18783.0 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.332200E+00 | loss scale: 4096.0 | grad norm: 135045.488 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6890/ 159576 | consumed samples: 250352 | elapsed time per iteration (ms): 18757.2 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.370442E+00 | loss scale: 4096.0 | grad norm: 140003.151 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6900/ 159576 | consumed samples: 251312 | elapsed time per iteration (ms): 18547.7 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.426891E+00 | loss scale: 4096.0 | grad norm: 166603.752 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6910/ 159576 | consumed samples: 252272 | elapsed time per iteration (ms): 18775.5 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.383529E+00 | loss scale: 4096.0 | grad norm: 161102.692 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6920/ 159576 | consumed samples: 253232 | elapsed time per iteration (ms): 18674.9 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.362777E+00 | loss scale: 4096.0 | grad norm: 135239.756 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6930/ 159576 | consumed samples: 254192 | elapsed time per iteration (ms): 18723.1 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.348313E+00 | loss scale: 4096.0 | grad norm: 180298.634 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6940/ 159576 | consumed samples: 255152 | elapsed time per iteration (ms): 18629.7 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.304693E+00 | loss scale: 4096.0 | grad norm: 155481.632 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6950/ 159576 | consumed samples: 256112 | elapsed time per iteration (ms): 18736.2 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.335081E+00 | loss scale: 4096.0 | grad norm: 170157.683 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-[2021-09-25 08:09:15] PULSE: tr8-104B is scheduled to start in 18:13:01 (at 2021-09-26T02:22:17) (1188168 on 'gpu_p13' partition)
-[2021-09-25 08:09:15] PULSE: tr8-104B is waiting for the previous Job Array job to finish before scheduling a new one (1185639_[2-10%1] on 'gpu_p13' partition)
-[2021-09-25 08:09:15] PULSE: tr8-104B is running for 3:42:14 since 2021-09-25T04:27:01 (1185639_1 on 'gpu_p13' partition (r6i5n[7-8],r6i6n0,r7i2n[4-5],r7i3n2,r7i6n[2-4],r7i7n[7-8],r8i0n[0,2-3,5-8],r8i1n[0,2-4],r8i2n8,r8i3n[0-2],r8i5n[3-4],r8i7n[3-8],r9i0n[0-5],r9i1n[0-3],r9i2n[3-6,8],r9i3n[0-1,7-8],r9i4n[0-3],r9i5n[3-8],r9i6n0)
- iteration 6960/ 159576 | consumed samples: 257072 | elapsed time per iteration (ms): 18679.3 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.350162E+00 | loss scale: 4096.0 | grad norm: 146048.789 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6970/ 159576 | consumed samples: 258032 | elapsed time per iteration (ms): 17405.9 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.358824E+00 | loss scale: 2048.0 | grad norm: 83822.155 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
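The "grad norm" column is the global L2 norm over all gradients before clipping. A single-process sketch of that quantity; the real run additionally reduces partial norms across its data- and model-parallel ranks, which is omitted here:

```python
def global_grad_norm(params):
    # sqrt of the summed squared L2 norms of every gradient tensor;
    # the logged values (e.g. "grad norm: 374400.560" at 6860) are this
    # quantity computed over the full sharded model
    total = 0.0
    for p in params:
        if p.grad is not None:
            total += p.grad.detach().float().norm(2).item() ** 2
    return total ** 0.5
```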
- iteration 6980/ 159576 | consumed samples: 258992 | elapsed time per iteration (ms): 18714.5 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.327154E+00 | loss scale: 2048.0 | grad norm: 55012.450 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6990/ 159576 | consumed samples: 259952 | elapsed time per iteration (ms): 18649.4 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.344659E+00 | loss scale: 2048.0 | grad norm: 62132.618 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7000/ 159576 | consumed samples: 260912 | elapsed time per iteration (ms): 18706.1 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.444662E+00 | loss scale: 2048.0 | grad norm: 98258.265 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-------------------------------------------------------------------------------------------------
- validation loss at iteration 7000 | lm loss value: 7.174200E+00 | lm loss PPL: 1.305315E+03 |
-------------------------------------------------------------------------------------------------
- iteration 7010/ 159576 | consumed samples: 261872 | elapsed time per iteration (ms): 19904.0 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 1.142026E+01 | loss scale: 2048.0 | grad norm: 219645.978 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7020/ 159576 | consumed samples: 262832 | elapsed time per iteration (ms): 18580.7 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 1.367010E+01 | loss scale: 2048.0 | grad norm: 223286.170 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-[2021-09-25 08:32:28] PULSE: tr8-104B is scheduled to start in 17:49:48 (at 2021-09-26T02:22:17) (1188168 on 'gpu_p13' partition)
-[2021-09-25 08:32:28] PULSE: tr8-104B is waiting for the previous Job Array job to finish before scheduling a new one (1185639_[2-10%1] on 'gpu_p13' partition)
-[2021-09-25 08:32:28] PULSE: tr8-104B is running for 4:05:27 since 2021-09-25T04:27:01 (1185639_1 on 'gpu_p13' partition (r6i5n[7-8],r6i6n0,r7i2n[4-5],r7i3n2,r7i6n[2-4],r7i7n[7-8],r8i0n[0,2-3,5-8],r8i1n[0,2-4],r8i2n8,r8i3n[0-2],r8i5n[3-4],r8i7n[3-8],r9i0n[0-5],r9i1n[0-3],r9i2n[3-6,8],r9i3n[0-1,7-8],r9i4n[0-3],r9i5n[3-8],r9i6n0)
- iteration 7030/ 159576 | consumed samples: 263792 | elapsed time per iteration (ms): 18402.7 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 1.182180E+01 | loss scale: 2048.0 | grad norm: 19931.456 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7040/ 159576 | consumed samples: 264752 | elapsed time per iteration (ms): 18461.7 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 9.981701E+00 | loss scale: 2048.0 | grad norm: 205737.088 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7050/ 159576 | consumed samples: 265712 | elapsed time per iteration (ms): 18431.2 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 9.425107E+00 | loss scale: 2048.0 | grad norm: 195793.297 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7060/ 159576 | consumed samples: 266672 | elapsed time per iteration (ms): 18498.9 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 8.606621E+00 | loss scale: 2048.0 | grad norm: 50379.603 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
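The validation block at iteration 7000 reports both the lm loss and its perplexity; PPL is simply exp(loss), which the logged pair confirms:

```python
import math

val_loss = 7.174200            # "lm loss value" at iteration 7000
print(math.exp(val_loss))      # ~1305.3, i.e. "lm loss PPL: 1.305315E+03"
```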
8.606621E+00 | loss scale: 2048.0 | grad norm: 50379.603 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7070/ 159576 | consumed samples: 267632 | elapsed time per iteration (ms): 18340.3 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 8.027315E+00 | loss scale: 2048.0 | grad norm: 37173.058 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7080/ 159576 | consumed samples: 268592 | elapsed time per iteration (ms): 18563.4 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 7.726066E+00 | loss scale: 2048.0 | grad norm: 22946.689 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7090/ 159576 | consumed samples: 269552 | elapsed time per iteration (ms): 18408.0 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 7.553810E+00 | loss scale: 2048.0 | grad norm: 16048.807 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7100/ 159576 | consumed samples: 270512 | elapsed time per iteration (ms): 18353.7 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 7.394469E+00 | loss scale: 2048.0 | grad norm: 10766.157 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -[2021-09-25 08:57:55] PULSE: tr8-104B is scheduled to start in 17:24:21 (at 2021-09-26T02:22:17) (1188168 on 'gpu_p13' partition) -[2021-09-25 08:57:55] PULSE: tr8-104B is waiting for the previous Job Array job to finish before scheduling a new one (1185639_[2-10%1] on 'gpu_p13' partition) -[2021-09-25 08:57:55] PULSE: tr8-104B is running for 4:30:54 since 2021-09-25T04:27:01 (1185639_1 on 'gpu_p13' partition (r6i5n[7-8],r6i6n0,r7i2n[4-5],r7i3n2,r7i6n[2-4],r7i7n[7-8],r8i0n[0,2-3,5-8],r8i1n[0,2-4],r8i2n8,r8i3n[0-2],r8i5n[3-4],r8i7n[3-8],r9i0n[0-5],r9i1n[0-3],r9i2n[3-6,8],r9i3n[0-1,7-8],r9i4n[0-3],r9i5n[3-8],r9i6n0) - iteration 7110/ 159576 | consumed samples: 271472 | elapsed time per iteration (ms): 18511.6 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 7.327065E+00 | loss scale: 2048.0 | grad norm: 25940.869 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7120/ 159576 | consumed samples: 272432 | elapsed time per iteration (ms): 18333.5 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 7.337917E+00 | loss scale: 2048.0 | grad norm: 18319.505 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7130/ 159576 | consumed samples: 273392 | elapsed time per iteration (ms): 18249.8 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 7.273988E+00 | loss scale: 2048.0 | grad norm: 14331.807 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7140/ 159576 | consumed samples: 274352 | elapsed time per iteration (ms): 18274.7 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 7.204887E+00 | loss scale: 2048.0 | grad norm: 21767.712 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -[2021-09-25 09:09:21] PULSE: tr8-104B is scheduled to start in 17:12:55 (at 2021-09-26T02:22:17) (1188168 on 'gpu_p13' partition) -[2021-09-25 09:09:21] PULSE: tr8-104B is waiting for the previous Job Array job to finish before scheduling a new one (1185639_[2-10%1] on 'gpu_p13' partition) -[2021-09-25 09:09:21] 
PULSE: tr8-104B is running for 4:42:20 since 2021-09-25T04:27:01 (1185639_1 on 'gpu_p13' partition (r6i5n[7-8],r6i6n0,r7i2n[4-5],r7i3n2,r7i6n[2-4],r7i7n[7-8],r8i0n[0,2-3,5-8],r8i1n[0,2-4],r8i2n8,r8i3n[0-2],r8i5n[3-4],r8i7n[3-8],r9i0n[0-5],r9i1n[0-3],r9i2n[3-6,8],r9i3n[0-1,7-8],r9i4n[0-3],r9i5n[3-8],r9i6n0) - iteration 7150/ 159576 | consumed samples: 275312 | elapsed time per iteration (ms): 18318.7 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 7.195872E+00 | loss scale: 2048.0 | grad norm: 14010.173 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7160/ 159576 | consumed samples: 276272 | elapsed time per iteration (ms): 18337.2 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 7.136990E+00 | loss scale: 2048.0 | grad norm: 23189.415 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7170/ 159576 | consumed samples: 277232 | elapsed time per iteration (ms): 18344.7 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 7.222323E+00 | loss scale: 2048.0 | grad norm: 22610.297 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7180/ 159576 | consumed samples: 278192 | elapsed time per iteration (ms): 18312.6 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 7.156533E+00 | loss scale: 2048.0 | grad norm: 12376.987 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7190/ 159576 | consumed samples: 279152 | elapsed time per iteration (ms): 18417.7 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 7.084262E+00 | loss scale: 2048.0 | grad norm: 38647.390 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7200/ 159576 | consumed samples: 280112 | elapsed time per iteration (ms): 18396.8 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 7.110893E+00 | loss scale: 2048.0 | grad norm: 21520.416 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7210/ 159576 | consumed samples: 281072 | elapsed time per iteration (ms): 18408.8 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 7.294872E+00 | loss scale: 2048.0 | grad norm: 77171.242 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7220/ 159576 | consumed samples: 282032 | elapsed time per iteration (ms): 18333.4 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 7.155109E+00 | loss scale: 2048.0 | grad norm: 16921.991 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7230/ 159576 | consumed samples: 282992 | elapsed time per iteration (ms): 18398.5 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 7.042103E+00 | loss scale: 2048.0 | grad norm: 13510.423 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7240/ 159576 | consumed samples: 284032 | elapsed time per iteration (ms): 19100.0 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.964984E+00 | loss scale: 2048.0 | grad norm: 11355.587 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7250/ 159576 | consumed samples: 285152 | elapsed time per iteration (ms): 19781.1 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 
7.051522E+00 | loss scale: 2048.0 | grad norm: 14836.710 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7260/ 159576 | consumed samples: 286272 | elapsed time per iteration (ms): 19836.2 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 7.050404E+00 | loss scale: 2048.0 | grad norm: 32092.591 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7270/ 159576 | consumed samples: 287392 | elapsed time per iteration (ms): 19719.8 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 7.034865E+00 | loss scale: 2048.0 | grad norm: 25809.031 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7280/ 159576 | consumed samples: 288512 | elapsed time per iteration (ms): 19632.8 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 7.038512E+00 | loss scale: 2048.0 | grad norm: 19816.017 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7290/ 159576 | consumed samples: 289632 | elapsed time per iteration (ms): 19704.6 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 7.051814E+00 | loss scale: 2048.0 | grad norm: 13138.906 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7300/ 159576 | consumed samples: 290752 | elapsed time per iteration (ms): 19431.1 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.962708E+00 | loss scale: 2048.0 | grad norm: 15505.241 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7310/ 159576 | consumed samples: 291872 | elapsed time per iteration (ms): 19625.1 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 7.068867E+00 | loss scale: 2048.0 | grad norm: 26542.834 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7320/ 159576 | consumed samples: 292992 | elapsed time per iteration (ms): 19705.6 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 7.131171E+00 | loss scale: 2048.0 | grad norm: 59185.721 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7330/ 159576 | consumed samples: 294112 | elapsed time per iteration (ms): 19592.0 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 7.030576E+00 | loss scale: 2048.0 | grad norm: 32033.660 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -[2021-09-25 10:09:39] PULSE: tr8-104B is scheduled to start in 17:07:05 (at 2021-09-26T03:16:45) (1188168 on 'gpu_p13' partition) -[2021-09-25 10:09:39] PULSE: tr8-104B is waiting for the previous Job Array job to finish before scheduling a new one (1185639_[2-10%1] on 'gpu_p13' partition) -[2021-09-25 10:09:39] PULSE: tr8-104B is running for 5:42:38 since 2021-09-25T04:27:01 (1185639_1 on 'gpu_p13' partition (r6i5n[7-8],r6i6n0,r7i2n[4-5],r7i3n2,r7i6n[2-4],r7i7n[7-8],r8i0n[0,2-3,5-8],r8i1n[0,2-4],r8i2n8,r8i3n[0-2],r8i5n[3-4],r8i7n[3-8],r9i0n[0-5],r9i1n[0-3],r9i2n[3-6,8],r9i3n[0-1,7-8],r9i4n[0-3],r9i5n[3-8],r9i6n0) - iteration 7340/ 159576 | consumed samples: 295232 | elapsed time per iteration (ms): 19566.4 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.981178E+00 | loss scale: 2048.0 | grad norm: 29317.971 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) 
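The global batch size steps up by 16 at a time through this excerpt (96 to 112 at iteration 7240 above, then 128 at 7660, 144 at 8030, 160 at 8360, 176 at 8660, 192 at 8920), a linear batch-size ramp-up of +16 roughly every 47k consumed samples. The consumed-samples column is the running sum of the per-step batch sizes, which is why the 7230 to 7240 window advances by only 1040 samples instead of 10 x 112: the ramp step landed mid-window. A quick check of that arithmetic:

    # The 10-iteration window 7230 -> 7240 spans the 96 -> 112 batch-size step.
    before, after = 282992, 284032      # consumed samples at iterations 7230 and 7240
    delta = after - before              # 1040 samples over 10 iterations
    # With k steps still at batch 96 and (10 - k) steps already at 112:
    #   96*k + 112*(10 - k) = delta  =>  k = (1120 - delta) / 16
    k = (10 * 112 - delta) // 16
    print(k)                            # 5 -> the switch happened halfway through the window

From iteration 7250 onward the windows advance by a clean 1120 = 10 x 112 samples until the next ramp step.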
- iteration 7350/ 159576 | consumed samples: 296352 | elapsed time per iteration (ms): 19494.3 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.969751E+00 | loss scale: 2048.0 | grad norm: 20774.916 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7360/ 159576 | consumed samples: 297472 | elapsed time per iteration (ms): 19789.2 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.939532E+00 | loss scale: 2048.0 | grad norm: 22939.531 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7370/ 159576 | consumed samples: 298592 | elapsed time per iteration (ms): 19854.7 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.888672E+00 | loss scale: 2048.0 | grad norm: 30762.881 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7380/ 159576 | consumed samples: 299712 | elapsed time per iteration (ms): 19888.4 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.906486E+00 | loss scale: 2048.0 | grad norm: 18438.642 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7390/ 159576 | consumed samples: 300832 | elapsed time per iteration (ms): 19703.2 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.877617E+00 | loss scale: 2048.0 | grad norm: 15185.355 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7400/ 159576 | consumed samples: 301952 | elapsed time per iteration (ms): 19654.2 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.854189E+00 | loss scale: 2048.0 | grad norm: 15960.831 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7410/ 159576 | consumed samples: 303072 | elapsed time per iteration (ms): 19528.4 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.894382E+00 | loss scale: 2048.0 | grad norm: 12842.484 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7420/ 159576 | consumed samples: 304192 | elapsed time per iteration (ms): 19701.7 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.860787E+00 | loss scale: 2048.0 | grad norm: 15167.024 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7430/ 159576 | consumed samples: 305312 | elapsed time per iteration (ms): 19702.1 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.859363E+00 | loss scale: 2048.0 | grad norm: 23062.497 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7440/ 159576 | consumed samples: 306432 | elapsed time per iteration (ms): 19933.7 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.860333E+00 | loss scale: 2048.0 | grad norm: 32840.662 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7450/ 159576 | consumed samples: 307552 | elapsed time per iteration (ms): 19857.9 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.824039E+00 | loss scale: 2048.0 | grad norm: 14512.315 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7460/ 159576 | consumed samples: 308672 | elapsed time per iteration (ms): 19438.9 | learning rate: 6.000E-05 | global batch size: 112 | lm 
loss: 6.828743E+00 | loss scale: 2048.0 | grad norm: 22065.697 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7470/ 159576 | consumed samples: 309792 | elapsed time per iteration (ms): 19647.3 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.799754E+00 | loss scale: 4096.0 | grad norm: 49640.058 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7480/ 159576 | consumed samples: 310912 | elapsed time per iteration (ms): 19818.5 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.815539E+00 | loss scale: 4096.0 | grad norm: 22148.104 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7490/ 159576 | consumed samples: 312032 | elapsed time per iteration (ms): 19788.8 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.894387E+00 | loss scale: 4096.0 | grad norm: 36912.117 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7500/ 159576 | consumed samples: 313152 | elapsed time per iteration (ms): 19799.3 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.841101E+00 | loss scale: 4096.0 | grad norm: 23983.193 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-saving checkpoint at iteration 7500 to /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints
-[2021-09-25 11:03:46,249] [INFO] [logging.py:68:log_dist] [Rank 0] Saving model checkpoint: /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/global_step7500/mp_rank_00_model_states.pt
- successfully saved checkpoint at iteration 7500 to /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints
-time (ms) | save-checkpoint: 18021.67
- iteration 7510/ 159576 | consumed samples: 314272 | elapsed time per iteration (ms): 21444.7 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.821138E+00 | loss scale: 4096.0 | grad norm: 27340.598 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-[2021-09-25 11:09:42] PULSE: tr8-104B is scheduled to start in 17:10:43 (at 2021-09-26T04:20:26) (1188168 on 'gpu_p13' partition)
-[2021-09-25 11:09:42] PULSE: tr8-104B is waiting for the previous Job Array job to finish before scheduling a new one (1185639_[2-10%1] on 'gpu_p13' partition)
-[2021-09-25 11:09:42] PULSE: tr8-104B is running for 6:42:41 since 2021-09-25T04:27:01 (1185639_1 on 'gpu_p13' partition (r6i5n[7-8],r6i6n0,r7i2n[4-5],r7i3n2,r7i6n[2-4],r7i7n[7-8],r8i0n[0,2-3,5-8],r8i1n[0,2-4],r8i2n8,r8i3n[0-2],r8i5n[3-4],r8i7n[3-8],r9i0n[0-5],r9i1n[0-3],r9i2n[3-6,8],r9i3n[0-1,7-8],r9i4n[0-3],r9i5n[3-8],r9i6n0)
- iteration 7520/ 159576 | consumed samples: 315392 | elapsed time per iteration (ms): 19669.6 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.839085E+00 | loss scale: 4096.0 | grad norm: 27168.782 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7530/ 159576 | consumed samples: 316512 | elapsed time per iteration (ms): 19673.9 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.866766E+00 | loss scale: 4096.0 | grad norm: 35661.716 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7540/ 159576 | consumed samples: 317632 | elapsed time per iteration (ms): 19547.7 | learning rate: 6.000E-05 | global batch size: 112 | lm loss:
6.895227E+00 | loss scale: 4096.0 | grad norm: 30950.102 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7550/ 159576 | consumed samples: 318752 | elapsed time per iteration (ms): 19728.4 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.974333E+00 | loss scale: 4096.0 | grad norm: 58146.349 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7560/ 159576 | consumed samples: 319872 | elapsed time per iteration (ms): 19670.9 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.993269E+00 | loss scale: 4096.0 | grad norm: 59358.983 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7570/ 159576 | consumed samples: 320992 | elapsed time per iteration (ms): 19932.4 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 7.018776E+00 | loss scale: 4096.0 | grad norm: 26693.574 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7580/ 159576 | consumed samples: 322112 | elapsed time per iteration (ms): 19801.6 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.954316E+00 | loss scale: 4096.0 | grad norm: 56910.600 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7590/ 159576 | consumed samples: 323232 | elapsed time per iteration (ms): 19757.6 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 7.019042E+00 | loss scale: 4096.0 | grad norm: 31511.156 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7600/ 159576 | consumed samples: 324352 | elapsed time per iteration (ms): 19717.1 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 7.002568E+00 | loss scale: 4096.0 | grad norm: 35214.039 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7610/ 159576 | consumed samples: 325472 | elapsed time per iteration (ms): 19801.0 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.968073E+00 | loss scale: 4096.0 | grad norm: 40886.049 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7620/ 159576 | consumed samples: 326592 | elapsed time per iteration (ms): 19491.3 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.959355E+00 | loss scale: 4096.0 | grad norm: 37865.294 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7630/ 159576 | consumed samples: 327712 | elapsed time per iteration (ms): 19606.0 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.927076E+00 | loss scale: 4096.0 | grad norm: 32908.139 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7640/ 159576 | consumed samples: 328832 | elapsed time per iteration (ms): 19669.6 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 7.079063E+00 | loss scale: 4096.0 | grad norm: 43561.929 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7650/ 159576 | consumed samples: 329952 | elapsed time per iteration (ms): 19813.3 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.977676E+00 | loss scale: 4096.0 | grad norm: 33954.223 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - 
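The checkpoint written at iteration 7500 above took 18021.67 ms ("save-checkpoint"), and that one-off cost is folded into the next 10-iteration timing window: iteration 7510 reports 21444.7 ms/iter against the ~19.7 s of its neighbors (19669.6 at 7520, 19673.9 at 7530). A small consistency check, assuming the save cost lands entirely inside that window:

    # Remove the amortized checkpoint cost from the iteration-7510 timing window.
    save_ms = 18021.67                   # "time (ms) | save-checkpoint" at iteration 7500
    reported_ms_per_iter = 21444.7       # window 7500 -> 7510
    adjusted = reported_ms_per_iter - save_ms / 10
    print(round(adjusted, 1))            # 19642.5, back in line with 19669.6 / 19673.9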
iteration 7660/ 159576 | consumed samples: 331120 | elapsed time per iteration (ms): 20182.2 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 7.071407E+00 | loss scale: 4096.0 | grad norm: 139629.093 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7670/ 159576 | consumed samples: 332400 | elapsed time per iteration (ms): 20921.2 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 7.133433E+00 | loss scale: 4096.0 | grad norm: 151598.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7680/ 159576 | consumed samples: 333680 | elapsed time per iteration (ms): 20923.7 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 7.093058E+00 | loss scale: 4096.0 | grad norm: 75854.068 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7690/ 159576 | consumed samples: 334960 | elapsed time per iteration (ms): 20468.2 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 7.040206E+00 | loss scale: 4096.0 | grad norm: 68735.463 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -[2021-09-25 12:10:01] PULSE: tr8-104B is scheduled to start in 18:54:29 (at 2021-09-26T07:04:31) (1188168 on 'gpu_p13' partition) -[2021-09-25 12:10:01] PULSE: tr8-104B is waiting for the previous Job Array job to finish before scheduling a new one (1185639_[2-10%1] on 'gpu_p13' partition) -[2021-09-25 12:10:01] PULSE: tr8-104B is running for 7:43:00 since 2021-09-25T04:27:01 (1185639_1 on 'gpu_p13' partition (r6i5n[7-8],r6i6n0,r7i2n[4-5],r7i3n2,r7i6n[2-4],r7i7n[7-8],r8i0n[0,2-3,5-8],r8i1n[0,2-4],r8i2n8,r8i3n[0-2],r8i5n[3-4],r8i7n[3-8],r9i0n[0-5],r9i1n[0-3],r9i2n[3-6,8],r9i3n[0-1,7-8],r9i4n[0-3],r9i5n[3-8],r9i6n0) - iteration 7700/ 159576 | consumed samples: 336240 | elapsed time per iteration (ms): 20712.9 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 6.991071E+00 | loss scale: 4096.0 | grad norm: 49058.974 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7710/ 159576 | consumed samples: 337520 | elapsed time per iteration (ms): 20803.8 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 6.999660E+00 | loss scale: 4096.0 | grad norm: 50810.796 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7720/ 159576 | consumed samples: 338800 | elapsed time per iteration (ms): 21027.6 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 7.148920E+00 | loss scale: 4096.0 | grad norm: 34526.386 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7730/ 159576 | consumed samples: 340080 | elapsed time per iteration (ms): 20621.1 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 6.952879E+00 | loss scale: 4096.0 | grad norm: 46587.607 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7740/ 159576 | consumed samples: 341360 | elapsed time per iteration (ms): 20787.7 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 7.077150E+00 | loss scale: 4096.0 | grad norm: 53834.886 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7750/ 159576 | consumed samples: 342640 | elapsed time per iteration (ms): 20790.5 | learning rate: 6.000E-05 | global batch size: 128 | 
lm loss: 7.024051E+00 | loss scale: 4096.0 | grad norm: 108296.631 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7760/ 159576 | consumed samples: 343920 | elapsed time per iteration (ms): 20756.3 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 7.185934E+00 | loss scale: 4096.0 | grad norm: 40243.918 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7770/ 159576 | consumed samples: 345200 | elapsed time per iteration (ms): 20678.9 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 7.155985E+00 | loss scale: 4096.0 | grad norm: 45818.733 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7780/ 159576 | consumed samples: 346480 | elapsed time per iteration (ms): 20656.6 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 7.028696E+00 | loss scale: 4096.0 | grad norm: 54814.681 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7790/ 159576 | consumed samples: 347760 | elapsed time per iteration (ms): 20773.2 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 6.962093E+00 | loss scale: 4096.0 | grad norm: 57105.334 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7800/ 159576 | consumed samples: 349040 | elapsed time per iteration (ms): 20735.7 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 7.054767E+00 | loss scale: 4096.0 | grad norm: 74767.367 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7810/ 159576 | consumed samples: 350320 | elapsed time per iteration (ms): 20748.9 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 6.948767E+00 | loss scale: 4096.0 | grad norm: 103822.696 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7820/ 159576 | consumed samples: 351600 | elapsed time per iteration (ms): 20609.0 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 6.995116E+00 | loss scale: 4096.0 | grad norm: 70594.913 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7830/ 159576 | consumed samples: 352880 | elapsed time per iteration (ms): 20891.2 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 7.140380E+00 | loss scale: 4096.0 | grad norm: 50257.684 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7840/ 159576 | consumed samples: 354160 | elapsed time per iteration (ms): 20736.5 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 7.051595E+00 | loss scale: 4096.0 | grad norm: 62967.110 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7850/ 159576 | consumed samples: 355440 | elapsed time per iteration (ms): 20790.1 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 6.921895E+00 | loss scale: 4096.0 | grad norm: 104168.914 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7860/ 159576 | consumed samples: 356720 | elapsed time per iteration (ms): 20774.7 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 7.071528E+00 | loss scale: 4096.0 | grad norm: 193610.451 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | 
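The validation blocks (iteration 7000 above, iteration 8000 below) report both the lm loss and its perplexity; the PPL column is just exp(lm loss), which the logged pairs confirm:

    import math

    # (lm loss value, lm loss PPL) pairs from the validation blocks at 7000 and 8000.
    for loss, reported_ppl in [(7.174200, 1.305315e3), (6.791678, 8.904064e2)]:
        assert abs(math.exp(loss) - reported_ppl) / reported_ppl < 1e-4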
-time (ms) - iteration 7870/ 159576 | consumed samples: 358000 | elapsed time per iteration (ms): 20837.0 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 7.086633E+00 | loss scale: 4096.0 | grad norm: 56330.990 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -[2021-09-25 13:10:06] PULSE: tr8-104B is scheduled to start in 17:54:24 (at 2021-09-26T07:04:31) (1188168 on 'gpu_p13' partition) -[2021-09-25 13:10:06] PULSE: tr8-104B is waiting for the previous Job Array job to finish before scheduling a new one (1185639_[2-10%1] on 'gpu_p13' partition) -[2021-09-25 13:10:06] PULSE: tr8-104B is running for 8:43:05 since 2021-09-25T04:27:01 (1185639_1 on 'gpu_p13' partition (r6i5n[7-8],r6i6n0,r7i2n[4-5],r7i3n2,r7i6n[2-4],r7i7n[7-8],r8i0n[0,2-3,5-8],r8i1n[0,2-4],r8i2n8,r8i3n[0-2],r8i5n[3-4],r8i7n[3-8],r9i0n[0-5],r9i1n[0-3],r9i2n[3-6,8],r9i3n[0-1,7-8],r9i4n[0-3],r9i5n[3-8],r9i6n0) - iteration 7880/ 159576 | consumed samples: 359280 | elapsed time per iteration (ms): 20746.8 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 7.156522E+00 | loss scale: 4096.0 | grad norm: 137295.607 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7890/ 159576 | consumed samples: 360560 | elapsed time per iteration (ms): 20983.7 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 6.996352E+00 | loss scale: 4096.0 | grad norm: 67763.557 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7900/ 159576 | consumed samples: 361840 | elapsed time per iteration (ms): 20640.0 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 6.985654E+00 | loss scale: 4096.0 | grad norm: 113013.123 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7910/ 159576 | consumed samples: 363120 | elapsed time per iteration (ms): 20742.0 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 6.976338E+00 | loss scale: 4096.0 | grad norm: 73140.648 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7920/ 159576 | consumed samples: 364400 | elapsed time per iteration (ms): 20679.4 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 6.917073E+00 | loss scale: 4096.0 | grad norm: 83861.566 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7930/ 159576 | consumed samples: 365680 | elapsed time per iteration (ms): 20531.8 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 6.971965E+00 | loss scale: 4096.0 | grad norm: 57978.154 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7940/ 159576 | consumed samples: 366960 | elapsed time per iteration (ms): 20446.7 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 7.117603E+00 | loss scale: 4096.0 | grad norm: 218144.909 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7950/ 159576 | consumed samples: 368240 | elapsed time per iteration (ms): 20823.5 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 7.029739E+00 | loss scale: 4096.0 | grad norm: 46987.640 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7960/ 159576 | consumed samples: 369520 | elapsed time per iteration (ms): 20775.8 | learning rate: 6.000E-05 | global batch 
size: 128 | lm loss: 6.972835E+00 | loss scale: 4096.0 | grad norm: 59193.517 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7970/ 159576 | consumed samples: 370800 | elapsed time per iteration (ms): 20508.8 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 6.890491E+00 | loss scale: 8192.0 | grad norm: 102786.418 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7980/ 159576 | consumed samples: 372080 | elapsed time per iteration (ms): 20983.1 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 6.927078E+00 | loss scale: 8192.0 | grad norm: 117997.551 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7990/ 159576 | consumed samples: 373360 | elapsed time per iteration (ms): 20495.7 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 6.823578E+00 | loss scale: 8192.0 | grad norm: 123947.033 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-[2021-09-25 13:53:58,625] [INFO] [logging.py:68:log_dist] [Rank 0] step=8000, skipped=17, lr=[5.999979430007177e-05, 5.999979430007177e-05], mom=[(0.9, 0.999), (0.9, 0.999)]
-steps: 8000 loss: 6.8207 iter time (s): 0.010 samples/sec: 13060.948
- iteration 8000/ 159576 | consumed samples: 374640 | elapsed time per iteration (ms): 20659.2 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 6.884979E+00 | loss scale: 8192.0 | grad norm: 131468.178 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-------------------------------------------------------------------------------------------------
- validation loss at iteration 8000 | lm loss value: 6.791678E+00 | lm loss PPL: 8.904064E+02 |
-------------------------------------------------------------------------------------------------
- iteration 8010/ 159576 | consumed samples: 375920 | elapsed time per iteration (ms): 22008.9 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 6.826038E+00 | loss scale: 8192.0 | grad norm: 154245.241 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8020/ 159576 | consumed samples: 377200 | elapsed time per iteration (ms): 20587.9 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 6.870419E+00 | loss scale: 8192.0 | grad norm: 129858.542 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8030/ 159576 | consumed samples: 378544 | elapsed time per iteration (ms): 21288.4 | learning rate: 6.000E-05 | global batch size: 144 | lm loss: 6.928481E+00 | loss scale: 8192.0 | grad norm: 226677.481 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8040/ 159576 | consumed samples: 379984 | elapsed time per iteration (ms): 21881.6 | learning rate: 6.000E-05 | global batch size: 144 | lm loss: 6.896291E+00 | loss scale: 8192.0 | grad norm: 205623.823 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-[2021-09-25 14:10:08] PULSE: tr8-104B is scheduled to start in 17:26:04 (at 2021-09-26T07:36:13) (1188168 on 'gpu_p13' partition)
-[2021-09-25 14:10:08] PULSE: tr8-104B is waiting for the previous Job Array job to finish before scheduling a new one (1185639_[2-10%1] on 'gpu_p13' partition)
-[2021-09-25 14:10:08] PULSE: tr8-104B is running for 9:43:07 since
2021-09-25T04:27:01 (1185639_1 on 'gpu_p13' partition (r6i5n[7-8],r6i6n0,r7i2n[4-5],r7i3n2,r7i6n[2-4],r7i7n[7-8],r8i0n[0,2-3,5-8],r8i1n[0,2-4],r8i2n8,r8i3n[0-2],r8i5n[3-4],r8i7n[3-8],r9i0n[0-5],r9i1n[0-3],r9i2n[3-6,8],r9i3n[0-1,7-8],r9i4n[0-3],r9i5n[3-8],r9i6n0) - iteration 8050/ 159576 | consumed samples: 381424 | elapsed time per iteration (ms): 21696.5 | learning rate: 6.000E-05 | global batch size: 144 | lm loss: 6.873873E+00 | loss scale: 8192.0 | grad norm: 146153.031 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8060/ 159576 | consumed samples: 382864 | elapsed time per iteration (ms): 21810.7 | learning rate: 6.000E-05 | global batch size: 144 | lm loss: 6.853185E+00 | loss scale: 8192.0 | grad norm: 101607.158 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8070/ 159576 | consumed samples: 384304 | elapsed time per iteration (ms): 21802.4 | learning rate: 6.000E-05 | global batch size: 144 | lm loss: 6.850246E+00 | loss scale: 8192.0 | grad norm: 139070.087 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8080/ 159576 | consumed samples: 385744 | elapsed time per iteration (ms): 21831.7 | learning rate: 6.000E-05 | global batch size: 144 | lm loss: 6.848817E+00 | loss scale: 8192.0 | grad norm: 129639.082 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8090/ 159576 | consumed samples: 387184 | elapsed time per iteration (ms): 21715.3 | learning rate: 6.000E-05 | global batch size: 144 | lm loss: 6.856639E+00 | loss scale: 8192.0 | grad norm: 200364.806 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8100/ 159576 | consumed samples: 388624 | elapsed time per iteration (ms): 21801.4 | learning rate: 6.000E-05 | global batch size: 144 | lm loss: 6.869398E+00 | loss scale: 8192.0 | grad norm: 141893.384 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8110/ 159576 | consumed samples: 390064 | elapsed time per iteration (ms): 21693.5 | learning rate: 6.000E-05 | global batch size: 144 | lm loss: 6.834469E+00 | loss scale: 8192.0 | grad norm: 133792.650 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8120/ 159576 | consumed samples: 391504 | elapsed time per iteration (ms): 21798.3 | learning rate: 6.000E-05 | global batch size: 144 | lm loss: 6.845126E+00 | loss scale: 8192.0 | grad norm: 196465.435 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8130/ 159576 | consumed samples: 392944 | elapsed time per iteration (ms): 21718.4 | learning rate: 6.000E-05 | global batch size: 144 | lm loss: 6.864041E+00 | loss scale: 8192.0 | grad norm: 234002.522 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8140/ 159576 | consumed samples: 394384 | elapsed time per iteration (ms): 20974.7 | learning rate: 6.000E-05 | global batch size: 144 | lm loss: 6.866895E+00 | loss scale: 8192.0 | grad norm: 214792.051 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8150/ 159576 | consumed samples: 395824 | elapsed time per iteration (ms): 20962.3 | learning rate: 6.000E-05 | global batch size: 144 | lm loss: 6.949483E+00 | loss scale: 4096.0 
| grad norm: 129105.294 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8160/ 159576 | consumed samples: 397264 | elapsed time per iteration (ms): 21839.6 | learning rate: 6.000E-05 | global batch size: 144 | lm loss: 6.982524E+00 | loss scale: 4096.0 | grad norm: 104094.455 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8170/ 159576 | consumed samples: 398704 | elapsed time per iteration (ms): 21626.3 | learning rate: 6.000E-05 | global batch size: 144 | lm loss: 6.968035E+00 | loss scale: 4096.0 | grad norm: 85705.545 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8180/ 159576 | consumed samples: 400144 | elapsed time per iteration (ms): 21733.4 | learning rate: 6.000E-05 | global batch size: 144 | lm loss: 6.983526E+00 | loss scale: 4096.0 | grad norm: 140563.515 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8190/ 159576 | consumed samples: 401584 | elapsed time per iteration (ms): 21768.5 | learning rate: 6.000E-05 | global batch size: 144 | lm loss: 7.016048E+00 | loss scale: 4096.0 | grad norm: 72531.033 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8200/ 159576 | consumed samples: 403024 | elapsed time per iteration (ms): 21929.8 | learning rate: 6.000E-05 | global batch size: 144 | lm loss: 6.996774E+00 | loss scale: 4096.0 | grad norm: 128628.095 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8210/ 159576 | consumed samples: 404464 | elapsed time per iteration (ms): 21876.8 | learning rate: 6.000E-05 | global batch size: 144 | lm loss: 6.954953E+00 | loss scale: 4096.0 | grad norm: 114237.351 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -[2021-09-25 15:10:12] PULSE: tr8-104B is scheduled to start in 20:25:18 (at 2021-09-26T11:35:31) (1188168 on 'gpu_p13' partition) -[2021-09-25 15:10:12] PULSE: tr8-104B is waiting for the previous Job Array job to finish before scheduling a new one (1185639_[2-10%1] on 'gpu_p13' partition) -[2021-09-25 15:10:12] PULSE: tr8-104B is running for 10:43:11 since 2021-09-25T04:27:01 (1185639_1 on 'gpu_p13' partition (r6i5n[7-8],r6i6n0,r7i2n[4-5],r7i3n2,r7i6n[2-4],r7i7n[7-8],r8i0n[0,2-3,5-8],r8i1n[0,2-4],r8i2n8,r8i3n[0-2],r8i5n[3-4],r8i7n[3-8],r9i0n[0-5],r9i1n[0-3],r9i2n[3-6,8],r9i3n[0-1,7-8],r9i4n[0-3],r9i5n[3-8],r9i6n0) - iteration 8220/ 159576 | consumed samples: 405904 | elapsed time per iteration (ms): 21992.9 | learning rate: 6.000E-05 | global batch size: 144 | lm loss: 6.927856E+00 | loss scale: 4096.0 | grad norm: 191859.936 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8230/ 159576 | consumed samples: 407344 | elapsed time per iteration (ms): 21845.4 | learning rate: 6.000E-05 | global batch size: 144 | lm loss: 6.915263E+00 | loss scale: 4096.0 | grad norm: 136325.623 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8240/ 159576 | consumed samples: 408784 | elapsed time per iteration (ms): 21179.2 | learning rate: 6.000E-05 | global batch size: 144 | lm loss: 6.864025E+00 | loss scale: 2048.0 | grad norm: 118355.574 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8250/ 159576 
| consumed samples: 410224 | elapsed time per iteration (ms): 21688.2 | learning rate: 6.000E-05 | global batch size: 144 | lm loss: 6.873029E+00 | loss scale: 2048.0 | grad norm: 72612.289 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8260/ 159576 | consumed samples: 411664 | elapsed time per iteration (ms): 21621.0 | learning rate: 6.000E-05 | global batch size: 144 | lm loss: 6.963725E+00 | loss scale: 2048.0 | grad norm: 77677.833 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8270/ 159576 | consumed samples: 413104 | elapsed time per iteration (ms): 21832.0 | learning rate: 6.000E-05 | global batch size: 144 | lm loss: 6.939199E+00 | loss scale: 2048.0 | grad norm: 80021.251 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8280/ 159576 | consumed samples: 414544 | elapsed time per iteration (ms): 21967.3 | learning rate: 6.000E-05 | global batch size: 144 | lm loss: 6.919482E+00 | loss scale: 2048.0 | grad norm: 58905.568 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8290/ 159576 | consumed samples: 415984 | elapsed time per iteration (ms): 21671.6 | learning rate: 6.000E-05 | global batch size: 144 | lm loss: 6.919662E+00 | loss scale: 2048.0 | grad norm: 52571.274 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8300/ 159576 | consumed samples: 417424 | elapsed time per iteration (ms): 21755.6 | learning rate: 6.000E-05 | global batch size: 144 | lm loss: 7.024297E+00 | loss scale: 2048.0 | grad norm: 77079.083 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8310/ 159576 | consumed samples: 418864 | elapsed time per iteration (ms): 21909.8 | learning rate: 6.000E-05 | global batch size: 144 | lm loss: 7.234490E+00 | loss scale: 2048.0 | grad norm: 102216.544 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8320/ 159576 | consumed samples: 420304 | elapsed time per iteration (ms): 21566.6 | learning rate: 6.000E-05 | global batch size: 144 | lm loss: 7.228243E+00 | loss scale: 2048.0 | grad norm: 88135.536 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8330/ 159576 | consumed samples: 421744 | elapsed time per iteration (ms): 22069.0 | learning rate: 6.000E-05 | global batch size: 144 | lm loss: 7.068048E+00 | loss scale: 2048.0 | grad norm: 65341.009 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8340/ 159576 | consumed samples: 423184 | elapsed time per iteration (ms): 21682.1 | learning rate: 6.000E-05 | global batch size: 144 | lm loss: 7.049673E+00 | loss scale: 2048.0 | grad norm: 45586.386 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8350/ 159576 | consumed samples: 424624 | elapsed time per iteration (ms): 21918.1 | learning rate: 6.000E-05 | global batch size: 144 | lm loss: 7.033588E+00 | loss scale: 2048.0 | grad norm: 60230.392 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8360/ 159576 | consumed samples: 426160 | elapsed time per iteration (ms): 22474.7 | learning rate: 6.000E-05 | global batch size: 160 | lm loss: 7.032515E+00 | loss 
scale: 2048.0 | grad norm: 55714.258 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8370/ 159576 | consumed samples: 427760 | elapsed time per iteration (ms): 22723.0 | learning rate: 6.000E-05 | global batch size: 160 | lm loss: 7.051062E+00 | loss scale: 2048.0 | grad norm: 68784.584 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -[2021-09-25 16:10:22] PULSE: tr8-104B is scheduled to start in 19:16:12 (at 2021-09-26T11:26:35) (1188168 on 'gpu_p13' partition) -[2021-09-25 16:10:22] PULSE: tr8-104B is waiting for the previous Job Array job to finish before scheduling a new one (1185639_[2-10%1] on 'gpu_p13' partition) -[2021-09-25 16:10:22] PULSE: tr8-104B is running for 11:43:21 since 2021-09-25T04:27:01 (1185639_1 on 'gpu_p13' partition (r6i5n[7-8],r6i6n0,r7i2n[4-5],r7i3n2,r7i6n[2-4],r7i7n[7-8],r8i0n[0,2-3,5-8],r8i1n[0,2-4],r8i2n8,r8i3n[0-2],r8i5n[3-4],r8i7n[3-8],r9i0n[0-5],r9i1n[0-3],r9i2n[3-6,8],r9i3n[0-1,7-8],r9i4n[0-3],r9i5n[3-8],r9i6n0) - iteration 8380/ 159576 | consumed samples: 429360 | elapsed time per iteration (ms): 22974.1 | learning rate: 6.000E-05 | global batch size: 160 | lm loss: 7.025337E+00 | loss scale: 2048.0 | grad norm: 89725.468 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8390/ 159576 | consumed samples: 430960 | elapsed time per iteration (ms): 22266.9 | learning rate: 6.000E-05 | global batch size: 160 | lm loss: 7.010270E+00 | loss scale: 1024.0 | grad norm: 33629.138 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8400/ 159576 | consumed samples: 432560 | elapsed time per iteration (ms): 22964.2 | learning rate: 6.000E-05 | global batch size: 160 | lm loss: 7.020833E+00 | loss scale: 1024.0 | grad norm: 46812.316 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8410/ 159576 | consumed samples: 434160 | elapsed time per iteration (ms): 22923.5 | learning rate: 6.000E-05 | global batch size: 160 | lm loss: 7.044554E+00 | loss scale: 1024.0 | grad norm: 55335.802 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8420/ 159576 | consumed samples: 435760 | elapsed time per iteration (ms): 22690.3 | learning rate: 6.000E-05 | global batch size: 160 | lm loss: 7.074860E+00 | loss scale: 1024.0 | grad norm: 27018.225 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8430/ 159576 | consumed samples: 437360 | elapsed time per iteration (ms): 22997.6 | learning rate: 6.000E-05 | global batch size: 160 | lm loss: 7.108445E+00 | loss scale: 1024.0 | grad norm: 95058.404 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8440/ 159576 | consumed samples: 438960 | elapsed time per iteration (ms): 22696.4 | learning rate: 6.000E-05 | global batch size: 160 | lm loss: 7.128921E+00 | loss scale: 1024.0 | grad norm: 44470.175 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8450/ 159576 | consumed samples: 440560 | elapsed time per iteration (ms): 22728.4 | learning rate: 6.000E-05 | global batch size: 160 | lm loss: 7.037349E+00 | loss scale: 1024.0 | grad norm: 32995.810 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8460/ 
159576 | consumed samples: 442160 | elapsed time per iteration (ms): 22856.0 | learning rate: 6.000E-05 | global batch size: 160 | lm loss: 7.064864E+00 | loss scale: 1024.0 | grad norm: 23093.772 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8470/ 159576 | consumed samples: 443760 | elapsed time per iteration (ms): 22824.5 | learning rate: 6.000E-05 | global batch size: 160 | lm loss: 7.057752E+00 | loss scale: 1024.0 | grad norm: 34580.324 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8480/ 159576 | consumed samples: 445360 | elapsed time per iteration (ms): 22939.9 | learning rate: 6.000E-05 | global batch size: 160 | lm loss: 7.111783E+00 | loss scale: 1024.0 | grad norm: 30415.135 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8490/ 159576 | consumed samples: 446960 | elapsed time per iteration (ms): 22647.3 | learning rate: 6.000E-05 | global batch size: 160 | lm loss: 7.077787E+00 | loss scale: 1024.0 | grad norm: 44228.518 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8500/ 159576 | consumed samples: 448560 | elapsed time per iteration (ms): 22870.1 | learning rate: 6.000E-05 | global batch size: 160 | lm loss: 7.017307E+00 | loss scale: 1024.0 | grad norm: 31106.331 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -[2021-09-25 17:00:02] PULSE: tr8-104B is scheduled to start in 18:26:32 (at 2021-09-26T11:26:35) (1188168 on 'gpu_p13' partition) -[2021-09-25 17:00:02] PULSE: tr8-104B is waiting for the previous Job Array job to finish before scheduling a new one (1185639_[2-10%1] on 'gpu_p13' partition) -[2021-09-25 17:00:02] PULSE: tr8-104B is running for 12:33:01 since 2021-09-25T04:27:01 (1185639_1 on 'gpu_p13' partition (r6i5n[7-8],r6i6n0,r7i2n[4-5],r7i3n2,r7i6n[2-4],r7i7n[7-8],r8i0n[0,2-3,5-8],r8i1n[0,2-4],r8i2n8,r8i3n[0-2],r8i5n[3-4],r8i7n[3-8],r9i0n[0-5],r9i1n[0-3],r9i2n[3-6,8],r9i3n[0-1,7-8],r9i4n[0-3],r9i5n[3-8],r9i6n0) - iteration 8510/ 159576 | consumed samples: 450160 | elapsed time per iteration (ms): 22836.1 | learning rate: 6.000E-05 | global batch size: 160 | lm loss: 7.033496E+00 | loss scale: 1024.0 | grad norm: 84589.712 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8520/ 159576 | consumed samples: 451760 | elapsed time per iteration (ms): 22678.6 | learning rate: 6.000E-05 | global batch size: 160 | lm loss: 7.034415E+00 | loss scale: 1024.0 | grad norm: 45889.295 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8530/ 159576 | consumed samples: 453360 | elapsed time per iteration (ms): 22820.3 | learning rate: 6.000E-05 | global batch size: 160 | lm loss: 7.022775E+00 | loss scale: 1024.0 | grad norm: 46421.613 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -[2021-09-25 17:10:31] PULSE: tr8-104B is scheduled to start in 18:16:03 (at 2021-09-26T11:26:35) (1188168 on 'gpu_p13' partition) -[2021-09-25 17:10:31] PULSE: tr8-104B is running for 12:43:30 since 2021-09-25T04:27:01 (1185639_1 on 'gpu_p13' partition (r6i5n[7-8],r6i6n0,r7i2n[4-5],r7i3n2,r7i6n[2-4],r7i7n[7-8],r8i0n[0,2-3,5-8],r8i1n[0,2-4],r8i2n8,r8i3n[0-2],r8i5n[3-4],r8i7n[3-8],r9i0n[0-5],r9i1n[0-3],r9i2n[3-6,8],r9i3n[0-1,7-8],r9i4n[0-3],r9i5n[3-8],r9i6n0) - iteration 
8540/ 159576 | consumed samples: 454960 | elapsed time per iteration (ms): 22803.2 | learning rate: 6.000E-05 | global batch size: 160 | lm loss: 7.015056E+00 | loss scale: 1024.0 | grad norm: 49138.667 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8550/ 159576 | consumed samples: 456560 | elapsed time per iteration (ms): 22969.4 | learning rate: 6.000E-05 | global batch size: 160 | lm loss: 7.037695E+00 | loss scale: 1024.0 | grad norm: 72675.159 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8560/ 159576 | consumed samples: 458160 | elapsed time per iteration (ms): 22624.1 | learning rate: 6.000E-05 | global batch size: 160 | lm loss: 7.040105E+00 | loss scale: 1024.0 | grad norm: 55417.219 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8570/ 159576 | consumed samples: 459760 | elapsed time per iteration (ms): 22663.1 | learning rate: 6.000E-05 | global batch size: 160 | lm loss: 7.066528E+00 | loss scale: 1024.0 | grad norm: 48492.969 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -[2021-09-25 17:26:58] PULSE: tr8-104B is scheduled to start in 17:59:36 (at 2021-09-26T11:26:35) (1188168 on 'gpu_p13' partition) -[2021-09-25 17:26:58] PULSE: tr8-104B is running for 12:59:57 since 2021-09-25T04:27:01 (1185639_1 on 'gpu_p13' partition (r6i5n[7-8],r6i6n0,r7i2n[4-5],r7i3n2,r7i6n[2-4],r7i7n[7-8],r8i0n[0,2-3,5-8],r8i1n[0,2-4],r8i2n8,r8i3n[0-2],r8i5n[3-4],r8i7n[3-8],r9i0n[0-5],r9i1n[0-3],r9i2n[3-6,8],r9i3n[0-1,7-8],r9i4n[0-3],r9i5n[3-8],r9i6n0) - iteration 8580/ 159576 | consumed samples: 461360 | elapsed time per iteration (ms): 22688.8 | learning rate: 6.000E-05 | global batch size: 160 | lm loss: 7.087028E+00 | loss scale: 1024.0 | grad norm: 46974.842 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8590/ 159576 | consumed samples: 462960 | elapsed time per iteration (ms): 22699.4 | learning rate: 6.000E-05 | global batch size: 160 | lm loss: 7.089204E+00 | loss scale: 1024.0 | grad norm: 44702.862 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8600/ 159576 | consumed samples: 464560 | elapsed time per iteration (ms): 22777.7 | learning rate: 6.000E-05 | global batch size: 160 | lm loss: 7.149306E+00 | loss scale: 1024.0 | grad norm: 261339.801 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8610/ 159576 | consumed samples: 466160 | elapsed time per iteration (ms): 22975.5 | learning rate: 6.000E-05 | global batch size: 160 | lm loss: 7.167276E+00 | loss scale: 1024.0 | grad norm: 105455.551 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8620/ 159576 | consumed samples: 467760 | elapsed time per iteration (ms): 23048.5 | learning rate: 6.000E-05 | global batch size: 160 | lm loss: 7.078442E+00 | loss scale: 1024.0 | grad norm: 84212.423 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8630/ 159576 | consumed samples: 469360 | elapsed time per iteration (ms): 22799.5 | learning rate: 6.000E-05 | global batch size: 160 | lm loss: 7.081234E+00 | loss scale: 1024.0 | grad norm: 52121.419 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 
8640/ 159576 | consumed samples: 470960 | elapsed time per iteration (ms): 22720.5 | learning rate: 6.000E-05 | global batch size: 160 | lm loss: 7.109283E+00 | loss scale: 1024.0 | grad norm: 48651.489 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8650/ 159576 | consumed samples: 472560 | elapsed time per iteration (ms): 22695.2 | learning rate: 6.000E-05 | global batch size: 160 | lm loss: 7.118199E+00 | loss scale: 1024.0 | grad norm: 26046.891 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8660/ 159576 | consumed samples: 474320 | elapsed time per iteration (ms): 23933.5 | learning rate: 6.000E-05 | global batch size: 176 | lm loss: 7.064212E+00 | loss scale: 1024.0 | grad norm: 40523.058 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8670/ 159576 | consumed samples: 476080 | elapsed time per iteration (ms): 23798.1 | learning rate: 6.000E-05 | global batch size: 176 | lm loss: 7.051229E+00 | loss scale: 1024.0 | grad norm: 28160.238 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8680/ 159576 | consumed samples: 477840 | elapsed time per iteration (ms): 23923.9 | learning rate: 6.000E-05 | global batch size: 176 | lm loss: 7.036906E+00 | loss scale: 1024.0 | grad norm: 51047.866 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8690/ 159576 | consumed samples: 479600 | elapsed time per iteration (ms): 23651.1 | learning rate: 6.000E-05 | global batch size: 176 | lm loss: 7.073657E+00 | loss scale: 1024.0 | grad norm: 141610.865 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -[2021-09-25 18:10:35] PULSE: tr8-104B is scheduled to start in 17:15:59 (at 2021-09-26T11:26:35) (1188168 on 'gpu_p13' partition) -[2021-09-25 18:10:35] PULSE: tr8-104B is running for 13:43:34 since 2021-09-25T04:27:01 (1185639_1 on 'gpu_p13' partition (r6i5n[7-8],r6i6n0,r7i2n[4-5],r7i3n2,r7i6n[2-4],r7i7n[7-8],r8i0n[0,2-3,5-8],r8i1n[0,2-4],r8i2n8,r8i3n[0-2],r8i5n[3-4],r8i7n[3-8],r9i0n[0-5],r9i1n[0-3],r9i2n[3-6,8],r9i3n[0-1,7-8],r9i4n[0-3],r9i5n[3-8],r9i6n0) - iteration 8700/ 159576 | consumed samples: 481360 | elapsed time per iteration (ms): 23943.4 | learning rate: 6.000E-05 | global batch size: 176 | lm loss: 7.071510E+00 | loss scale: 1024.0 | grad norm: 24381.440 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8710/ 159576 | consumed samples: 483120 | elapsed time per iteration (ms): 23910.3 | learning rate: 6.000E-05 | global batch size: 176 | lm loss: 7.190697E+00 | loss scale: 1024.0 | grad norm: 41525.807 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8720/ 159576 | consumed samples: 484880 | elapsed time per iteration (ms): 23923.5 | learning rate: 6.000E-05 | global batch size: 176 | lm loss: 7.332158E+00 | loss scale: 1024.0 | grad norm: 23580.074 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8730/ 159576 | consumed samples: 486640 | elapsed time per iteration (ms): 23664.9 | learning rate: 6.000E-05 | global batch size: 176 | lm loss: 7.250137E+00 | loss scale: 1024.0 | grad norm: 33934.114 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 
8740/ 159576 | consumed samples: 488400 | elapsed time per iteration (ms): 24002.8 | learning rate: 6.000E-05 | global batch size: 176 | lm loss: 7.134158E+00 | loss scale: 1024.0 | grad norm: 18917.778 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8750/ 159576 | consumed samples: 490160 | elapsed time per iteration (ms): 23812.9 | learning rate: 6.000E-05 | global batch size: 176 | lm loss: 7.133132E+00 | loss scale: 1024.0 | grad norm: 24524.875 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8760/ 159576 | consumed samples: 491920 | elapsed time per iteration (ms): 24164.0 | learning rate: 6.000E-05 | global batch size: 176 | lm loss: 7.089709E+00 | loss scale: 1024.0 | grad norm: 18466.411 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8770/ 159576 | consumed samples: 493680 | elapsed time per iteration (ms): 23763.0 | learning rate: 6.000E-05 | global batch size: 176 | lm loss: 7.075866E+00 | loss scale: 1024.0 | grad norm: 21160.208 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8780/ 159576 | consumed samples: 495440 | elapsed time per iteration (ms): 23757.0 | learning rate: 6.000E-05 | global batch size: 176 | lm loss: 7.105405E+00 | loss scale: 1024.0 | grad norm: 21012.399 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8790/ 159576 | consumed samples: 497200 | elapsed time per iteration (ms): 23726.0 | learning rate: 6.000E-05 | global batch size: 176 | lm loss: 7.119524E+00 | loss scale: 1024.0 | grad norm: 19184.310 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -[2021-09-25 18:51:17] PULSE: tr8-104B is scheduled to start in 19:55:07 (at 2021-09-26T14:46:25) (1188168 on 'gpu_p13' partition) -[2021-09-25 18:51:17] PULSE: tr8-104B is running for 14:24:16 since 2021-09-25T04:27:01 (1185639_1 on 'gpu_p13' partition (r6i5n[7-8],r6i6n0,r7i2n[4-5],r7i3n2,r7i6n[2-4],r7i7n[7-8],r8i0n[0,2-3,5-8],r8i1n[0,2-4],r8i2n8,r8i3n[0-2],r8i5n[3-4],r8i7n[3-8],r9i0n[0-5],r9i1n[0-3],r9i2n[3-6,8],r9i3n[0-1,7-8],r9i4n[0-3],r9i5n[3-8],r9i6n0) - iteration 8800/ 159576 | consumed samples: 498960 | elapsed time per iteration (ms): 23872.5 | learning rate: 6.000E-05 | global batch size: 176 | lm loss: 7.150304E+00 | loss scale: 1024.0 | grad norm: 20582.002 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8810/ 159576 | consumed samples: 500720 | elapsed time per iteration (ms): 23674.3 | learning rate: 6.000E-05 | global batch size: 176 | lm loss: 7.121466E+00 | loss scale: 1024.0 | grad norm: 26026.638 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8820/ 159576 | consumed samples: 502480 | elapsed time per iteration (ms): 23655.3 | learning rate: 6.000E-05 | global batch size: 176 | lm loss: 7.227619E+00 | loss scale: 1024.0 | grad norm: 19493.231 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8830/ 159576 | consumed samples: 504240 | elapsed time per iteration (ms): 24040.7 | learning rate: 6.000E-05 | global batch size: 176 | lm loss: 7.202127E+00 | loss scale: 1024.0 | grad norm: 21130.889 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 
8840/ 159576 | consumed samples: 506000 | elapsed time per iteration (ms): 23751.6 | learning rate: 6.000E-05 | global batch size: 176 | lm loss: 7.102602E+00 | loss scale: 1024.0 | grad norm: 15258.781 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -[2021-09-25 19:10:38] PULSE: tr8-104B is scheduled to start in 19:35:46 (at 2021-09-26T14:46:25) (1188168 on 'gpu_p13' partition) -[2021-09-25 19:10:38] PULSE: tr8-104B is running for 14:43:37 since 2021-09-25T04:27:01 (1185639_1 on 'gpu_p13' partition (r6i5n[7-8],r6i6n0,r7i2n[4-5],r7i3n2,r7i6n[2-4],r7i7n[7-8],r8i0n[0,2-3,5-8],r8i1n[0,2-4],r8i2n8,r8i3n[0-2],r8i5n[3-4],r8i7n[3-8],r9i0n[0-5],r9i1n[0-3],r9i2n[3-6,8],r9i3n[0-1,7-8],r9i4n[0-3],r9i5n[3-8],r9i6n0) - iteration 8850/ 159576 | consumed samples: 507760 | elapsed time per iteration (ms): 23681.3 | learning rate: 6.000E-05 | global batch size: 176 | lm loss: 7.106478E+00 | loss scale: 1024.0 | grad norm: 15650.558 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8860/ 159576 | consumed samples: 509520 | elapsed time per iteration (ms): 23830.0 | learning rate: 6.000E-05 | global batch size: 176 | lm loss: 7.077826E+00 | loss scale: 1024.0 | grad norm: 13271.961 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8870/ 159576 | consumed samples: 511280 | elapsed time per iteration (ms): 23830.3 | learning rate: 6.000E-05 | global batch size: 176 | lm loss: 7.083195E+00 | loss scale: 1024.0 | grad norm: 13942.816 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8880/ 159576 | consumed samples: 513040 | elapsed time per iteration (ms): 23893.7 | learning rate: 6.000E-05 | global batch size: 176 | lm loss: 7.101151E+00 | loss scale: 1024.0 | grad norm: 17666.067 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8890/ 159576 | consumed samples: 514800 | elapsed time per iteration (ms): 23733.4 | learning rate: 6.000E-05 | global batch size: 176 | lm loss: 7.130984E+00 | loss scale: 2048.0 | grad norm: 41179.422 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8900/ 159576 | consumed samples: 516560 | elapsed time per iteration (ms): 23693.0 | learning rate: 6.000E-05 | global batch size: 176 | lm loss: 7.084023E+00 | loss scale: 2048.0 | grad norm: 32703.102 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8910/ 159576 | consumed samples: 518320 | elapsed time per iteration (ms): 23793.1 | learning rate: 6.000E-05 | global batch size: 176 | lm loss: 7.094463E+00 | loss scale: 2048.0 | grad norm: 46954.552 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8920/ 159576 | consumed samples: 520112 | elapsed time per iteration (ms): 23988.6 | learning rate: 6.000E-05 | global batch size: 192 | lm loss: 7.094890E+00 | loss scale: 2048.0 | grad norm: 20910.711 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8930/ 159576 | consumed samples: 522032 | elapsed time per iteration (ms): 24780.5 | learning rate: 6.000E-05 | global batch size: 192 | lm loss: 7.112840E+00 | loss scale: 2048.0 | grad norm: 23723.304 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 
8940/ 159576 | consumed samples: 523952 | elapsed time per iteration (ms): 24880.9 | learning rate: 6.000E-05 | global batch size: 192 | lm loss: 7.157214E+00 | loss scale: 2048.0 | grad norm: 35769.072 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8950/ 159576 | consumed samples: 525872 | elapsed time per iteration (ms): 24820.3 | learning rate: 6.000E-05 | global batch size: 192 | lm loss: 7.212303E+00 | loss scale: 2048.0 | grad norm: 20241.796 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8960/ 159576 | consumed samples: 527792 | elapsed time per iteration (ms): 24706.7 | learning rate: 6.000E-05 | global batch size: 192 | lm loss: 7.215181E+00 | loss scale: 2048.0 | grad norm: 48969.302 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8970/ 159576 | consumed samples: 529712 | elapsed time per iteration (ms): 23528.3 | learning rate: 6.000E-05 | global batch size: 192 | loss scale: 1024.0 | grad norm: 156762.139 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8980/ 159576 | consumed samples: 531632 | elapsed time per iteration (ms): 18302.5 | learning rate: 6.000E-05 | global batch size: 192 | loss scale: 2.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 8990/ 159576 | consumed samples: 533552 | elapsed time per iteration (ms): 17645.0 | learning rate: 6.000E-05 | global batch size: 192 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -[2021-09-25 20:10:52] PULSE: tr8-104B is scheduled to start in 18:35:32 (at 2021-09-26T14:46:25) (1188168 on 'gpu_p13' partition) -[2021-09-25 20:10:52] PULSE: tr8-104B is running for 15:43:51 since 2021-09-25T04:27:01 (1185639_1 on 'gpu_p13' partition (r6i5n[7-8],r6i6n0,r7i2n[4-5],r7i3n2,r7i6n[2-4],r7i7n[7-8],r8i0n[0,2-3,5-8],r8i1n[0,2-4],r8i2n8,r8i3n[0-2],r8i5n[3-4],r8i7n[3-8],r9i0n[0-5],r9i1n[0-3],r9i2n[3-6,8],r9i3n[0-1,7-8],r9i4n[0-3],r9i5n[3-8],r9i6n0) - iteration 9000/ 159576 | consumed samples: 535472 | elapsed time per iteration (ms): 17316.3 | learning rate: 6.000E-05 | global batch size: 192 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) ------------------------------------------------------------------------------------------------- - validation loss at iteration 9000 | lm loss value: 7.256732E+00 | lm loss PPL: 1.417617E+03 | ------------------------------------------------------------------------------------------------- -saving checkpoint at iteration 9000 to /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints -[2021-09-25 20:11:32,719] [INFO] [logging.py:68:log_dist] [Rank 0] Saving model checkpoint: /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/global_step9000/mp_rank_00_model_states.pt - successfully saved checkpoint at iteration 9000 to /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints -time (ms) | save-checkpoint: 17709.49 - iteration 9010/ 159576 | consumed samples: 537392 | elapsed time per iteration (ms): 21623.6 | learning rate: 6.000E-05 | global batch size: 192 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 9020/ 
159576 | consumed samples: 539312 | elapsed time per iteration (ms): 17559.0 | learning rate: 6.000E-05 | global batch size: 192 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 9030/ 159576 | consumed samples: 541232 | elapsed time per iteration (ms): 17827.7 | learning rate: 6.000E-05 | global batch size: 192 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 9040/ 159576 | consumed samples: 543152 | elapsed time per iteration (ms): 17458.2 | learning rate: 6.000E-05 | global batch size: 192 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 9050/ 159576 | consumed samples: 545072 | elapsed time per iteration (ms): 17470.7 | learning rate: 6.000E-05 | global batch size: 192 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 9060/ 159576 | consumed samples: 546992 | elapsed time per iteration (ms): 17813.0 | learning rate: 6.000E-05 | global batch size: 192 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 9070/ 159576 | consumed samples: 548912 | elapsed time per iteration (ms): 17646.8 | learning rate: 6.000E-05 | global batch size: 192 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 9080/ 159576 | consumed samples: 550832 | elapsed time per iteration (ms): 17634.4 | learning rate: 6.000E-05 | global batch size: 192 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 9090/ 159576 | consumed samples: 552752 | elapsed time per iteration (ms): 17734.2 | learning rate: 6.000E-05 | global batch size: 192 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 9100/ 159576 | consumed samples: 554672 | elapsed time per iteration (ms): 17470.3 | learning rate: 6.000E-05 | global batch size: 192 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 9110/ 159576 | consumed samples: 556592 | elapsed time per iteration (ms): 17443.8 | learning rate: 6.000E-05 | global batch size: 192 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 9120/ 159576 | consumed samples: 558512 | elapsed time per iteration (ms): 17456.2 | learning rate: 6.000E-05 | global batch size: 192 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 9130/ 159576 | consumed samples: 560432 | elapsed time per iteration (ms): 17374.7 | learning rate: 6.000E-05 | global batch size: 192 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 9140/ 159576 | consumed samples: 562352 | elapsed time per iteration (ms): 17541.4 | learning rate: 6.000E-05 | global batch size: 192 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 
0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 9150/ 159576 | consumed samples: 564272 | elapsed time per iteration (ms): 17680.4 | learning rate: 6.000E-05 | global batch size: 192 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 9160/ 159576 | consumed samples: 566192 | elapsed time per iteration (ms): 17412.1 | learning rate: 6.000E-05 | global batch size: 192 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 9170/ 159576 | consumed samples: 568208 | elapsed time per iteration (ms): 18281.1 | learning rate: 6.000E-05 | global batch size: 208 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 9180/ 159576 | consumed samples: 570288 | elapsed time per iteration (ms): 18627.2 | learning rate: 6.000E-05 | global batch size: 208 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 9190/ 159576 | consumed samples: 572368 | elapsed time per iteration (ms): 18546.6 | learning rate: 6.000E-05 | global batch size: 208 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -[2021-09-25 21:10:54] PULSE: tr8-104B is scheduled to start in 17:35:30 (at 2021-09-26T14:46:25) (1188168 on 'gpu_p13' partition) -[2021-09-25 21:10:54] PULSE: tr8-104B is running for 16:43:53 since 2021-09-25T04:27:01 (1185639_1 on 'gpu_p13' partition (r6i5n[7-8],r6i6n0,r7i2n[4-5],r7i3n2,r7i6n[2-4],r7i7n[7-8],r8i0n[0,2-3,5-8],r8i1n[0,2-4],r8i2n8,r8i3n[0-2],r8i5n[3-4],r8i7n[3-8],r9i0n[0-5],r9i1n[0-3],r9i2n[3-6,8],r9i3n[0-1,7-8],r9i4n[0-3],r9i5n[3-8],r9i6n0) - iteration 9200/ 159576 | consumed samples: 574448 | elapsed time per iteration (ms): 18675.7 | learning rate: 6.000E-05 | global batch size: 208 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 9210/ 159576 | consumed samples: 576528 | elapsed time per iteration (ms): 18679.9 | learning rate: 6.000E-05 | global batch size: 208 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 9220/ 159576 | consumed samples: 578608 | elapsed time per iteration (ms): 18524.7 | learning rate: 6.000E-05 | global batch size: 208 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 9230/ 159576 | consumed samples: 580688 | elapsed time per iteration (ms): 18762.7 | learning rate: 6.000E-05 | global batch size: 208 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 9240/ 159576 | consumed samples: 582768 | elapsed time per iteration (ms): 18695.7 | learning rate: 6.000E-05 | global batch size: 208 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 9250/ 159576 | consumed samples: 584848 | elapsed time per iteration (ms): 18780.0 | learning rate: 6.000E-05 | global batch size: 208 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 
0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 9260/ 159576 | consumed samples: 586928 | elapsed time per iteration (ms): 18593.2 | learning rate: 6.000E-05 | global batch size: 208 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 9270/ 159576 | consumed samples: 589008 | elapsed time per iteration (ms): 18476.6 | learning rate: 6.000E-05 | global batch size: 208 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 9280/ 159576 | consumed samples: 591088 | elapsed time per iteration (ms): 18595.2 | learning rate: 6.000E-05 | global batch size: 208 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 9290/ 159576 | consumed samples: 593168 | elapsed time per iteration (ms): 18498.1 | learning rate: 6.000E-05 | global batch size: 208 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 9300/ 159576 | consumed samples: 595248 | elapsed time per iteration (ms): 18531.6 | learning rate: 6.000E-05 | global batch size: 208 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 9310/ 159576 | consumed samples: 597328 | elapsed time per iteration (ms): 18538.6 | learning rate: 6.000E-05 | global batch size: 208 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 9320/ 159576 | consumed samples: 599408 | elapsed time per iteration (ms): 18768.3 | learning rate: 6.000E-05 | global batch size: 208 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 9330/ 159576 | consumed samples: 601488 | elapsed time per iteration (ms): 18445.0 | learning rate: 6.000E-05 | global batch size: 208 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 9340/ 159576 | consumed samples: 603568 | elapsed time per iteration (ms): 18700.8 | learning rate: 6.000E-05 | global batch size: 208 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 9350/ 159576 | consumed samples: 605648 | elapsed time per iteration (ms): 18716.7 | learning rate: 6.000E-05 | global batch size: 208 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 9360/ 159576 | consumed samples: 607728 | elapsed time per iteration (ms): 18488.0 | learning rate: 6.000E-05 | global batch size: 208 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 9370/ 159576 | consumed samples: 609808 | elapsed time per iteration (ms): 18621.0 | learning rate: 6.000E-05 | global batch size: 208 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 9380/ 159576 | consumed samples: 611888 | elapsed time per iteration (ms): 18781.4 | 
learning rate: 6.000E-05 | global batch size: 208 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 9390/ 159576 | consumed samples: 613968 | elapsed time per iteration (ms): 18582.4 | learning rate: 6.000E-05 | global batch size: 208 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -[2021-09-25 22:11:04] PULSE: tr8-104B is scheduled to start in 17:17:05 (at 2021-09-26T15:28:10) (1188168 on 'gpu_p13' partition) -[2021-09-25 22:11:04] PULSE: tr8-104B is running for 17:44:03 since 2021-09-25T04:27:01 (1185639_1 on 'gpu_p13' partition (r6i5n[7-8],r6i6n0,r7i2n[4-5],r7i3n2,r7i6n[2-4],r7i7n[7-8],r8i0n[0,2-3,5-8],r8i1n[0,2-4],r8i2n8,r8i3n[0-2],r8i5n[3-4],r8i7n[3-8],r9i0n[0-5],r9i1n[0-3],r9i2n[3-6,8],r9i3n[0-1,7-8],r9i4n[0-3],r9i5n[3-8],r9i6n0) - iteration 9400/ 159576 | consumed samples: 616192 | elapsed time per iteration (ms): 19918.8 | learning rate: 6.000E-05 | global batch size: 224 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 9410/ 159576 | consumed samples: 618432 | elapsed time per iteration (ms): 19675.6 | learning rate: 6.000E-05 | global batch size: 224 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 9420/ 159576 | consumed samples: 620672 | elapsed time per iteration (ms): 19904.3 | learning rate: 6.000E-05 | global batch size: 224 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 9430/ 159576 | consumed samples: 622912 | elapsed time per iteration (ms): 19702.9 | learning rate: 6.000E-05 | global batch size: 224 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 9440/ 159576 | consumed samples: 625152 | elapsed time per iteration (ms): 19798.2 | learning rate: 6.000E-05 | global batch size: 224 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 9450/ 159576 | consumed samples: 627392 | elapsed time per iteration (ms): 19797.6 | learning rate: 6.000E-05 | global batch size: 224 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 9460/ 159576 | consumed samples: 629632 | elapsed time per iteration (ms): 20223.0 | learning rate: 6.000E-05 | global batch size: 224 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 9470/ 159576 | consumed samples: 631872 | elapsed time per iteration (ms): 19847.6 | learning rate: 6.000E-05 | global batch size: 224 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 9480/ 159576 | consumed samples: 634112 | elapsed time per iteration (ms): 19783.5 | learning rate: 6.000E-05 | global batch size: 224 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 9490/ 159576 | consumed samples: 636352 | elapsed time per iteration (ms): 19768.8 | 
learning rate: 6.000E-05 | global batch size: 224 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 9500/ 159576 | consumed samples: 638592 | elapsed time per iteration (ms): 19836.7 | learning rate: 6.000E-05 | global batch size: 224 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 9510/ 159576 | consumed samples: 640832 | elapsed time per iteration (ms): 19791.2 | learning rate: 6.000E-05 | global batch size: 224 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 9520/ 159576 | consumed samples: 643072 | elapsed time per iteration (ms): 19677.8 | learning rate: 6.000E-05 | global batch size: 224 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 9530/ 159576 | consumed samples: 645312 | elapsed time per iteration (ms): 19695.3 | learning rate: 6.000E-05 | global batch size: 224 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 9540/ 159576 | consumed samples: 647552 | elapsed time per iteration (ms): 19697.0 | learning rate: 6.000E-05 | global batch size: 224 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 9550/ 159576 | consumed samples: 649792 | elapsed time per iteration (ms): 19776.4 | learning rate: 6.000E-05 | global batch size: 224 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 9560/ 159576 | consumed samples: 652032 | elapsed time per iteration (ms): 19726.6 | learning rate: 6.000E-05 | global batch size: 224 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 9570/ 159576 | consumed samples: 654272 | elapsed time per iteration (ms): 19764.1 | learning rate: 6.000E-05 | global batch size: 224 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -[2021-09-25 23:11:05] PULSE: tr8-104B is scheduled to start in 18:13:44 (at 2021-09-26T17:24:50) (1188168 on 'gpu_p13' partition) -[2021-09-25 23:11:05] PULSE: tr8-104B is running for 18:44:04 since 2021-09-25T04:27:01 (1185639_1 on 'gpu_p13' partition (r6i5n[7-8],r6i6n0,r7i2n[4-5],r7i3n2,r7i6n[2-4],r7i7n[7-8],r8i0n[0,2-3,5-8],r8i1n[0,2-4],r8i2n8,r8i3n[0-2],r8i5n[3-4],r8i7n[3-8],r9i0n[0-5],r9i1n[0-3],r9i2n[3-6,8],r9i3n[0-1,7-8],r9i4n[0-3],r9i5n[3-8],r9i6n0) - iteration 9580/ 159576 | consumed samples: 656512 | elapsed time per iteration (ms): 19889.3 | learning rate: 6.000E-05 | global batch size: 224 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 9590/ 159576 | consumed samples: 658752 | elapsed time per iteration (ms): 19672.3 | learning rate: 6.000E-05 | global batch size: 224 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 9600/ 159576 | consumed samples: 660992 | elapsed time per iteration (ms): 19668.0 | 
learning rate: 6.000E-05 | global batch size: 224 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 9610/ 159576 | consumed samples: 663360 | elapsed time per iteration (ms): 20660.1 | learning rate: 6.000E-05 | global batch size: 240 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 9620/ 159576 | consumed samples: 665760 | elapsed time per iteration (ms): 20759.5 | learning rate: 6.000E-05 | global batch size: 240 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 9630/ 159576 | consumed samples: 668160 | elapsed time per iteration (ms): 20573.3 | learning rate: 6.000E-05 | global batch size: 240 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 9640/ 159576 | consumed samples: 670560 | elapsed time per iteration (ms): 21117.4 | learning rate: 6.000E-05 | global batch size: 240 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 9650/ 159576 | consumed samples: 672960 | elapsed time per iteration (ms): 21312.3 | learning rate: 6.000E-05 | global batch size: 240 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 9660/ 159576 | consumed samples: 675360 | elapsed time per iteration (ms): 20596.0 | learning rate: 6.000E-05 | global batch size: 240 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 9670/ 159576 | consumed samples: 677760 | elapsed time per iteration (ms): 20413.4 | learning rate: 6.000E-05 | global batch size: 240 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 9680/ 159576 | consumed samples: 680160 | elapsed time per iteration (ms): 20820.1 | learning rate: 6.000E-05 | global batch size: 240 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 9690/ 159576 | consumed samples: 682560 | elapsed time per iteration (ms): 20882.2 | learning rate: 6.000E-05 | global batch size: 240 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 9700/ 159576 | consumed samples: 684960 | elapsed time per iteration (ms): 21320.0 | learning rate: 6.000E-05 | global batch size: 240 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 9710/ 159576 | consumed samples: 687360 | elapsed time per iteration (ms): 20632.6 | learning rate: 6.000E-05 | global batch size: 240 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 9720/ 159576 | consumed samples: 689760 | elapsed time per iteration (ms): 20593.0 | learning rate: 6.000E-05 | global batch size: 240 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time 
(ms) - iteration 9730/ 159576 | consumed samples: 692160 | elapsed time per iteration (ms): 21160.0 | learning rate: 6.000E-05 | global batch size: 240 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 9740/ 159576 | consumed samples: 694560 | elapsed time per iteration (ms): 20918.8 | learning rate: 6.000E-05 | global batch size: 240 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -[2021-09-26 00:11:13] PULSE: tr8-104B is scheduled to start in 17:13:36 (at 2021-09-26T17:24:50) (1188168 on 'gpu_p13' partition) -[2021-09-26 00:11:13] PULSE: tr8-104B is running for 19:44:12 since 2021-09-25T04:27:01 (1185639_1 on 'gpu_p13' partition (r6i5n[7-8],r6i6n0,r7i2n[4-5],r7i3n2,r7i6n[2-4],r7i7n[7-8],r8i0n[0,2-3,5-8],r8i1n[0,2-4],r8i2n8,r8i3n[0-2],r8i5n[3-4],r8i7n[3-8],r9i0n[0-5],r9i1n[0-3],r9i2n[3-6,8],r9i3n[0-1,7-8],r9i4n[0-3],r9i5n[3-8],r9i6n0) - iteration 9750/ 159576 | consumed samples: 696960 | elapsed time per iteration (ms): 20828.1 | learning rate: 6.000E-05 | global batch size: 240 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 9760/ 159576 | consumed samples: 699360 | elapsed time per iteration (ms): 20766.8 | learning rate: 6.000E-05 | global batch size: 240 | loss scale: 1.0 | grad norm: 5927.930 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -saving checkpoint at iteration 9768 to /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints -[2021-09-26 00:17:36,090] [INFO] [logging.py:68:log_dist] [Rank 0] Saving model checkpoint: /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/global_step9768/mp_rank_00_model_states.pt - successfully saved checkpoint at iteration 9768 to /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints -time (ms) | save-checkpoint: 22024.89 -[exiting program after 1190.3113538821538 minutes] datetime: 2021-09-26 00:17:52 -[2021-09-26 01:11:06] PULSE: tr8-104B is scheduled to start in 18:25:25 (at 2021-09-26T19:36:32) (1188168 on 'gpu_p13' partition) -[2021-09-26 02:11:19] PULSE: tr8-104B is scheduled to start in 17:25:12 (at 2021-09-26T19:36:32) (1188168 on 'gpu_p13' partition) -[2021-09-26 03:11:35] PULSE: tr8-104B is scheduled to start in 19:51:55 (at 2021-09-26T23:03:31) (1188168 on 'gpu_p13' partition) -[2021-09-26 04:11:39] PULSE: tr8-104B is scheduled to start in 19:06:56 (at 2021-09-26T23:18:36) (1188168 on 'gpu_p13' partition) -[2021-09-26 05:11:41] PULSE: tr8-104B is scheduled to start in 18:19:12 (at 2021-09-26T23:30:54) (1188168 on 'gpu_p13' partition) -[2021-09-26 06:11:46] PULSE: tr8-104B is scheduled to start in 17:19:07 (at 2021-09-26T23:30:54) (1188168 on 'gpu_p13' partition) -[2021-09-26 07:11:59] PULSE: tr8-104B is scheduled to start in 17:27:45 (at 2021-09-27T00:39:45) (1188168 on 'gpu_p13' partition) -[2021-09-26 08:12:02] PULSE: tr8-104B is scheduled to start in 12:30:49 (at 2021-09-26T20:42:52) (1188168 on 'gpu_p13' partition) -[2021-09-26 09:12:23] PULSE: tr8-104B is scheduled to start in 11:30:28 (at 2021-09-26T20:42:52) (1188168 on 'gpu_p13' partition) -[2021-09-26 10:12:24] PULSE: tr8-104B is scheduled to start in 10:30:27 (at 2021-09-26T20:42:52) (1188168 on 'gpu_p13' partition) -[2021-09-26 11:12:28] PULSE: tr8-104B is scheduled to start in 9:30:23 (at 2021-09-26T20:42:52) (1188168 on 
'gpu_p13' partition) -[2021-09-26 12:12:40] PULSE: tr8-104B is scheduled to start in 10:14:45 (at 2021-09-26T22:27:26) (1188168 on 'gpu_p13' partition) -[2021-09-26 13:12:49] PULSE: tr8-104B is scheduled to start in 9:14:36 (at 2021-09-26T22:27:26) (1188168 on 'gpu_p13' partition) -[2021-09-26 14:12:56] PULSE: tr8-104B is scheduled to start in 8:33:42 (at 2021-09-26T22:46:39) (1188168 on 'gpu_p13' partition) -[2021-09-26 15:13:22] PULSE: tr8-104B is scheduled to start in 7:16:41 (at 2021-09-26T22:30:04) (1188168 on 'gpu_p13' partition) -[2021-09-26 16:13:24] PULSE: tr8-104B is scheduled to start in 6:16:39 (at 2021-09-26T22:30:04) (1188168 on 'gpu_p13' partition) -[2021-09-26 17:13:32] PULSE: tr8-104B is scheduled to start in 5:16:31 (at 2021-09-26T22:30:04) (1188168 on 'gpu_p13' partition) -[2021-09-26 18:13:29] PULSE: tr8-104B is scheduled to start in 9:13:25 (at 2021-09-27T03:26:55) (1188168 on 'gpu_p13' partition) -[2021-09-26 19:13:42] PULSE: tr8-104B is scheduled to start in 12:06:13 (at 2021-09-27T07:19:56) (1188168 on 'gpu_p13' partition) -[2021-09-26 20:13:45] PULSE: tr8-104B is scheduled to start in 11:06:10 (at 2021-09-27T07:19:56) (1188168 on 'gpu_p13' partition) -[2021-09-26 21:14:04] PULSE: tr8-104B is scheduled to start in 18:20:04 (at 2021-09-27T15:34:09) (1188168 on 'gpu_p13' partition) -[2021-09-26 22:14:04] PULSE: tr8-104B is scheduled to start in 17:20:04 (at 2021-09-27T15:34:09) (1188168 on 'gpu_p13' partition) -[2021-09-26 23:14:12] PULSE: tr8-104B is scheduled to start in 16:36:40 (at 2021-09-27T15:50:53) (1188168 on 'gpu_p13' partition) -[2021-09-27 00:14:11] PULSE: tr8-104B is scheduled to start in 15:32:33 (at 2021-09-27T15:46:45) (1188168 on 'gpu_p13' partition) -[2021-09-27 01:14:15] PULSE: tr8-104B is scheduled to start in 14:32:29 (at 2021-09-27T15:46:45) (1188168 on 'gpu_p13' partition) -[2021-09-27 02:14:18] PULSE: tr8-104B is scheduled to start in 10:17:12 (at 2021-09-27T12:31:31) (1188168 on 'gpu_p13' partition) -[2021-09-27 03:14:23] PULSE: tr8-104B is scheduled to start in 9:17:07 (at 2021-09-27T12:31:31) (1188168 on 'gpu_p13' partition) -***************************************** -Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. -*****************************************
-***************************************** -***************************************** -Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. -***************************************** -***************************************** -Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. -***************************************** -***************************************** -Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. -***************************************** -***************************************** -Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. -***************************************** -***************************************** -Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. -***************************************** -***************************************** -Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. -***************************************** -***************************************** -Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. -***************************************** -***************************************** -Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. -***************************************** -***************************************** -Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. -***************************************** -***************************************** -Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. -***************************************** -***************************************** -Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. -***************************************** -***************************************** -Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. 
-***************************************** -***************************************** -Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. -***************************************** -***************************************** -Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. -***************************************** -***************************************** -Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. -***************************************** -***************************************** -Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. -***************************************** -***************************************** -Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. -***************************************** -***************************************** -Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. -***************************************** -***************************************** -Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. -***************************************** -***************************************** -Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. -***************************************** -***************************************** -Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. -***************************************** -***************************************** -Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. -***************************************** -***************************************** -Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. -***************************************** -***************************************** -Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. 
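The launcher emits the OMP_NUM_THREADS warning above once per spawned rank because it pins each process to a single OpenMP thread by default. If each rank has spare CPU cores, the variable can be raised before PyTorch initializes OpenMP; a minimal sketch, assuming four cores per rank (the value "4" is illustrative, not taken from this run):

    import os

    # Must run before importing torch: OpenMP reads the variable
    # when the runtime is first initialized.
    os.environ.setdefault("OMP_NUM_THREADS", "4")

    import torch  # import deliberately placed after the env tweak

Alternatively, exporting the variable in the launch script before invoking the launcher has the same effect for every rank.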
--------------------------------------------------
-DeepSpeed C++/CUDA extension op report
--------------------------------------------------
-NOTE: Ops not installed will be just-in-time (JIT) compiled at
- runtime if needed. Op compatibility means that your system
- meet the required dependencies to JIT install the op.
--------------------------------------------------
-JIT compiled ops requires ninja
-ninja .................. [OKAY]
--------------------------------------------------
-op name ................ installed .. compatible
--------------------------------------------------
-cpu_adam ............... [YES] ...... [OKAY]
-fused_adam ............. [NO] ....... [OKAY]
-fused_lamb ............. [NO] ....... [OKAY]
-sparse_attn ............ [NO] ....... [OKAY]
-transformer ............ [NO] ....... [OKAY]
-stochastic_transformer . [NO] ....... [OKAY]
--------------------------------------------------
compatible --------------------------------------------------- -transformer ............ [NO] ....... [OKAY] -cpu_adam ............... [YES] ...... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -ninja .................. [OKAY] -DeepSpeed C++/CUDA extension op report --------------------------------------------------- --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... 
[OKAY] -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -ninja .................. [OKAY] -sparse_attn ............ [NO] ....... [OKAY] --------------------------------------------------- -transformer ............ 
[NO] ....... [OKAY] -op name ................ installed .. compatible -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -ninja .................. [OKAY] -fused_adam ............. [NO] ....... [OKAY] --------------------------------------------------- -fused_lamb ............. [NO] ....... [OKAY] -op name ................ installed .. compatible --------------------------------------------------- -sparse_attn ............ [NO] ....... [OKAY] -cpu_adam ............... [YES] ...... [OKAY] -transformer ............ [NO] ....... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -ninja .................. [OKAY] -transformer ............ [NO] ....... [OKAY] --------------------------------------------------- -stochastic_transformer . [NO] ....... [OKAY] -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. 
--------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -fused_adam ............. [NO] ....... [OKAY] -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... 
[OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] -ninja .................. [OKAY] --------------------------------------------------- --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -op name ................ installed .. compatible -cpu_adam ............... [YES] ...... [OKAY] --------------------------------------------------- -fused_adam ............. [NO] ....... [OKAY] -cpu_adam ............... [YES] ...... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... 
[OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -ninja .................. [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -op name ................ installed .. compatible --------------------------------------------------- --------------------------------------------------- -op name ................ installed .. compatible -cpu_adam ............... [YES] ...... 
[OKAY] --------------------------------------------------- -fused_adam ............. [NO] ....... [OKAY] -cpu_adam ............... [YES] ...... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -ninja .................. [OKAY] -fused_lamb ............. [NO] ....... [OKAY] --------------------------------------------------- -sparse_attn ............ [NO] ....... [OKAY] -op name ................ installed .. compatible --------------------------------------------------- -transformer ............ [NO] ....... [OKAY] -cpu_adam ............... [YES] ...... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- --------------------------------------------------- -JIT compiled ops requires ninja -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... 
[OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -fused_lamb ............. [NO] ....... [OKAY] -JIT compiled ops requires ninja -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. 
--------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -ninjastochastic_transformer ................... [OKAY][NO] - .......-------------------------------------------------- -[OKAY] -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -ninja .................. [OKAY] -sparse_attn ............ [NO] ....... [OKAY] --------------------------------------------------- -transformer ............ [NO] ....... [OKAY] -op name ................ installed .. compatible -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. -cpu_adam ............... [YES] ...... [OKAY] --------------------------------------------------- -JIT compiled ops requires ninja -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. 
--------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -ninja .................. [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] --------------------------------------------------- -sparse_attn ............ [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -op name ................ installed .. compatible --------------------------------------------------- -transformer ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -cpu_adam ............... [YES] ...... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. 
--------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -ninja .................. [OKAY] -sparse_attn ............ [NO] ....... [OKAY] --------------------------------------------------- -transformer ............ [NO] ....... [OKAY] --------------------------------------------------- --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -stochastic_transformer . [NO] ....... [OKAY] -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] --------------------------------------------------- -JIT compiled ops requires ninja -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... 
- [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`.
-async_io ............... [NO] ....... [NO]
-transformer_inference .. [NO] ....... [OKAY]
-utils .................. [YES] ...... [OKAY]
-quantizer .............. [NO] ....... [OKAY]
---------------------------------------------------
-cpu_adam ............... [YES] ...... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] - [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -fused_adam ............. [NO] .......fused_adam [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -............. [NO]fused_lamb .................... [NO][OKAY] -sparse_attn ............ [NO] ....... [OKAY] -....... [OKAY] -transformer ............ [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -sparse_attn transformer............ ............[NO] [NO] .............. [OKAY][OKAY] - -stochastic_transformertransformer ............. [NO][NO] ....... .......[OKAY] -[OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] --------------------------------------------------- -transformer ............ [NO] ....... [OKAY] -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. 
--------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] --------------------------------------------------- -stochastic_transformer . [NO] ....... [OKAY] -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. --------------------------------------------------- -async_io ............... [NO] ....... [NO] -op name ................ installed .. compatible --------------------------------------------------- -transformer_inference .. [NO] ....... [OKAY] -cpu_adam ............... [YES] ...... [OKAY] -utils .................. [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -quantizer .............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -sparse_attn ............ [NO] ....... [OKAY] -async_io ............... [NO] ....... [NO] -transformer ............ [NO] ....... [OKAY] -transformer_inference .. [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -ninja .................. [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... 
[OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -ninja .................. [OKAY] -transformer ............ [NO] ....... [OKAY] --------------------------------------------------- -stochastic_transformer . [NO] ....... [OKAY] -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -ninja .................. [OKAY] -fused_adam ............. [NO] ....... [OKAY] --------------------------------------------------- -fused_lamb ............. [NO] ....... [OKAY] -op name ................ installed .. compatible --------------------------------------------------- -sparse_attn ............ [NO] ....... [OKAY] -cpu_adam ............... [YES] ...... [OKAY] -transformer ............ [NO] ....... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. 
--------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] --------------------------------------------------- -transformer_inference .. [NO] ....... [OKAY] -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -utils .................. [YES] ...... [OKAY] -JIT compiled ops requires ninja -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -ninja .................. [OKAY] -async_io ............... [NO] ....... [NO] --------------------------------------------------- -transformer_inference .. [NO] ....... [OKAY] -op name ................ installed .. compatible --------------------------------------------------- -utils .................. [YES] ...... [OKAY] -cpu_adam ............... [YES] ...... [OKAY] -quantizer .............. [NO] ....... 
[OKAY] --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. -fused_adam-------------------------------------------------- -.............JIT compiled ops requires ninja -[NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -ninja .................. [OKAY] -cpu_adam ............... [YES] ...... [OKAY] --------------------------------------------------- -fused_adam ............. [NO] ....... [OKAY] -op name ................ installed .. compatible --------------------------------------------------- -ninja .................. [OKAY] --------------------------------------------------- -fused_lamb ............. [NO] ....... [OKAY] -cpu_adam ............... [YES] ...... [OKAY] -op name ................ installed .. compatible --------------------------------------------------- -sparse_attn ............ [NO] ....... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -cpu_adam ............... [YES] ...... [OKAY] -transformer ............ [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -stochastic_transformer . [NO] ....... [OKAY] -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -quantizer .............. [NO] ....... [OKAY] -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... 
[OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -fused_lamb ............. [NO] ....... [OKAY] -async_io ............... [NO] ....... [NO] -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -sparse_attn ............ [NO] ....... [OKAY] -transformer_inference .. [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. --------------------------------------------------- -async_io ............... [NO] ....... [NO] -cpu_adam ............... [YES] ...... [OKAY] -transformer_inference .. [NO] ....... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -quantizer .............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] --------------------------------------------------- -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. 
compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -sparse_attn ............ [NO] ....... [OKAY] -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. -transformer ............ [NO] ....... [OKAY] --------------------------------------------------- -JIT compiled ops requires ninja -stochastic_transformer . [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -ninja .................. [OKAY] -utils .................. [YES] ...... [OKAY] --------------------------------------------------- -quantizer .............. [NO] ....... [OKAY] -op name ................ installed .. compatible --------------------------------------------------- --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... 
[OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. 
compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -ninja .................. [OKAY] -sparse_attn ............ [NO] ....... [OKAY] --------------------------------------------------- -transformer ............ [NO] ....... [OKAY] -op name ................ installed .. compatible --------------------------------------------------- -stochastic_transformer . [NO] ....... [OKAY] -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -sparse_attn ............ [NO] ....... [OKAY] -async_io ............... [NO] ....... [NO] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -transformer ............ [NO] ....... [OKAY] -transformer_inference .. [NO] ....... [OKAY] -async_io ............... [NO] ....... [NO] -stochastic_transformer . [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -transformer_inference .. [NO] ....... [OKAY] -ninja .................. [OKAY] -quantizer .............. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] --------------------------------------------------- -ninja .................. [OKAY] --------------------------------------------------- -quantizer .............. [NO] ....... [OKAY] -op name ................ installed .. compatible --------------------------------------------------- --------------------------------------------------- -ninja .................. [OKAY] --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -op name ................ installed .. compatible --------------------------------------------------- --------------------------------------------------- -fused_adam ............. [NO] ....... [OKAY] -cpu_adam ............... [YES] ...... [OKAY] -op name ................ installed .. compatible --------------------------------------------------- -fused_lamb ............. [NO] ....... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -cpu_adam ............... [YES] ...... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -ninja .................. [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] --------------------------------------------------- -stochastic_transformer . [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -op name ................ installed .. compatible --------------------------------------------------- -stochastic_transformer . [NO] ....... [OKAY] -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. 
--------------------------------------------------- -async_io ............... [NO] ....... [NO] -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -transformer_inference .. [NO] ....... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -quantizer .............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -transformer_inference utils.. ..................[NO] [YES]....... ......[OKAY] -[OKAY] -quantizer utils.............. ..................[NO] [YES]....... ......[OKAY] -[OKAY] --------------------------------------------------- -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -fused_adam ............. [NO] ....... [OKAY] -async_io ............... [NO] ....... [NO] -fused_lamb ............. [NO] ....... [OKAY] -transformer_inference .. [NO] ....... 
[OKAY] -sparse_attn ............ [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -transformer ............ [NO] ....... [OKAY] -quantizer .............. [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -ninja .................. [OKAY] --------------------------------------------------- -ninja .................. [OKAY] -op name ................ installed .. compatible --------------------------------------------------- --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -op name ................ installed .. compatible -fused_adam ............. [NO] ....... [OKAY] --------------------------------------------------- -fused_lamb ............. [NO] ....... [OKAY] -cpu_adam ............... [YES] ...... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -async_io ............... [NO] ....... [NO] -fused_adam ............. [NO] ....... [OKAY] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -fused_lamb ............. [NO] ....... [OKAY] -quantizer .............. [NO] ....... [OKAY] -async_io ............... [NO] ....... [NO] -sparse_attn ............ [NO] ....... [OKAY] --------------------------------------------------- -transformer_inference .. [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. 
--------------------------------------------------- -JIT compiled ops requires ninja - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. --------------------------------------------------- -async_io ............... [NO] ....... [NO] -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -transformer_inference .. [NO] ....... [OKAY] -JIT compiled ops requires ninja -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. --------------------------------------------------- -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. --------------------------------------------------- -JIT compiled ops requires ninja -async_io ............... [NO] ....... [NO] -ninja .................. [OKAY] -transformer_inference .. [NO] ....... [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -utils .................. [YES] ...... [OKAY] -cpu_adam ............... [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] -fused_adam ............. [NO] ....... [OKAY] --------------------------------------------------- -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ 
[NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -stochastic_transformer . [NO] ....... [OKAY] -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system -fused_adam ............. [NO] ....... [OKAY] - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. 
compatible
---------------------------------------------------
-cpu_adam ............... [YES] ...... [OKAY]
-fused_adam ............. [NO] ....... [OKAY]
-fused_lamb ............. [NO] ....... [OKAY]
-sparse_attn ............ [NO] ....... [OKAY]
-transformer ............ [NO] ....... [OKAY]
-stochastic_transformer . [NO] ....... [OKAY]
---------------------------------------------------
- [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`.
-async_io ............... [NO] ....... [NO]
-transformer_inference .. [NO] ....... [OKAY]
-utils .................. [YES] ...... [OKAY]
-quantizer .............. [NO] ....... [OKAY]
---------------------------------------------------
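The async_io warning above is emitted once per rank: every process probes for the libaio shared library and reports the same missing dependency. A minimal sketch of such a probe, using only the Python standard library; this is an illustration of the check, not DeepSpeed's actual implementation:

import ctypes.util

def has_libaio() -> bool:
    # find_library returns a soname such as "libaio.so.1" when the
    # library is installed, and None otherwise.
    return ctypes.util.find_library("aio") is not None

if __name__ == "__main__":
    if has_libaio():
        print("libaio found: the async_io op can be JIT-built")
    else:
        print("libaio missing: `apt install libaio-dev`, then retry")

On the cluster above the warning is harmless, since async_io is reported as not installed and is never JIT-compiled.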
---------------------------------------------------
-DeepSpeed C++/CUDA extension op report
---------------------------------------------------
-NOTE: Ops not installed will be just-in-time (JIT) compiled at
-      runtime if needed. Op compatibility means that your system
-      meet the required dependencies to JIT install the op.
---------------------------------------------------
-JIT compiled ops requires ninja
-ninja .................. [OKAY]
---------------------------------------------------
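Each rank prints this op report at startup (the `ds_report` CLI produces the same table). A hedged sketch of querying op compatibility programmatically; the `deepspeed.ops.op_builder` import path and the builder class names are assumptions that have shifted across DeepSpeed releases, so treat this as illustrative rather than exact for 0.4.2:

# Assumption: op builders are importable from deepspeed.ops.op_builder
# and expose name / is_compatible(), as in recent DeepSpeed releases.
from deepspeed.ops.op_builder import CPUAdamBuilder, FusedAdamBuilder

for builder in (CPUAdamBuilder(), FusedAdamBuilder()):
    # is_compatible() reports whether this system has the dependencies
    # needed to JIT-compile the op at runtime.
    print(f"{builder.name:<24} compatible={builder.is_compatible()}")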
-DeepSpeed general environment info:
-torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']
-torch version .................... 1.8.1
-torch cuda version ............... 11.1
-nvcc version ..................... 11.2
-deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']
-deepspeed info ................... 0.4.2+bc17042, bc17042, big-science
-deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1
---------------------------------------------------
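The environment block can be reproduced from a Python prompt. `torch.__version__`, `torch.__path__`, and `torch.version.cuda` are stable torch attributes; the `deepspeed.git_version_info` module, which the "deepspeed info" line (version, git hash, git branch) is built from, is an assumption about this particular build:

import torch
import deepspeed
# Assumption: git_version_info is present in this DeepSpeed build.
from deepspeed import git_version_info as ds_build

print("torch install path ...", torch.__path__)
print("torch version ........", torch.__version__)
print("torch cuda version ...", torch.version.cuda)
print("deepspeed version ....", deepspeed.__version__)
print("deepspeed git info ...", ds_build.git_hash, ds_build.git_branch)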
Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -ninja .................. [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -nvcc version ..................... 11.2 -async_ioasync_io .............................. [NO][NO] .............. [NO][NO] -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_io ............... [NO] ....... [NO] -transformer_inference ..transformer_inference [NO].. .......[NO] [OKAY]....... - [OKAY] -transformer_inference .. [NO] ....... [OKAY] -utilsutils .................................... [YES][YES] ............ [OKAY][OKAY] - -ninja .................. [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer quantizer.............. ..............[NO] [NO]....... .......[OKAY] -[OKAY] ----------------------------------------------------------------------------------------------------- - --------------------------------------------------- -quantizer .............. [NO] ....... [OKAY] -op name ................ installed .. compatible --------------------------------------------------- --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. 
[OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] -DeepSpeed general environment info: --------------------------------------------------- -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -op name ................ installed .. compatible --------------------------------------------------- -torch version .................... 1.8.1 -cpu_adam ............... [YES] ...... [OKAY] -torch cuda version ............... 11.1 -fused_adam ............. [NO] ....... [OKAY] -nvcc version ..................... 11.2 --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -fused_lamb ............. [NO] ....... [OKAY] -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -JIT compiled ops requires ninja -sparse_attn ............ [NO] ....... [OKAY] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -cpu_adam ............... [YES] ...... [OKAY] -async_io ............... [NO] ....... [NO] -fused_adam ............. [NO] ....... [OKAY] -transformer_inference .. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -quantizer .............. [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] --------------------------------------------------- -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... 
[OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -stochastic_transformer . [NO] ....... [OKAY] -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -DeepSpeed general environment info: -stochastic_transformer . [NO] ....... [OKAY] -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -DeepSpeed general environment info: -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... 
[OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -ninja .................. [OKAY] -transformer ............ [NO] ....... [OKAY] --------------------------------------------------- -stochastic_transformer . [NO] ....... [OKAY] -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science --------------------------------------------------- -DeepSpeed C++/CUDA extension op report -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -transformer ............ [NO] ....... [OKAY] -async_io ............... [NO] ....... [NO] -stochastic_transformer . [NO] ....... [OKAY] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -quantizer .............. [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. --------------------------------------------------- -async_io ............... [NO] ....... async_io[NO] - ............... [NO] ....... [NO] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -transformer_inference .. [NO] ....... transformer_inference[OKAY] -async_io ............... [NO] ....... [NO] -.. [NO] ....... [OKAY] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ......utils [OKAY].................. - [YES] ...... quantizer[OKAY] -utils .................. [YES] ...... 
[OKAY] -.............. [NO] .......quantizer [OKAY].............. - [NO] ....... --------------------------------------------------[OKAY] - -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -DeepSpeed general environment info: -quantizer .............. [NO] ....... [OKAY] -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] --------------------------------------------------- -torch version .................... 1.8.1 -torch cuda version ............... 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -nvcc version ..................... 11.2 -async_io ............... [NO] ....... [NO] -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -transformer_inference .. [NO] ....... [OKAY] -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -ninja .................. [OKAY] -utils .................. [YES] ...... [OKAY] --------------------------------------------------- -quantizer .............. [NO] ....... [OKAY] -op name ................ installed .. compatible --------------------------------------------------- --------------------------------------------------- -cpu_adam ............... [YES] ...... 
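The blocks above are DeepSpeed's standard startup diagnostic; every launched rank prints them, which is why they dominate multi-node logs, and the same output can be regenerated on a single node with the `ds_report` utility that ships with DeepSpeed. A minimal sketch of silencing the recurring async_io warning, assuming a Debian/Ubuntu host with root access (the log itself only suggests the apt command; DS_BUILD_AIO is DeepSpeed's opt-in prebuild flag and does not appear in this log):

  # install the libaio headers the warning asks for
  sudo apt install libaio-dev
  # optional: prebuild the async_io op at install time
  # instead of relying on JIT compilation at startup
  DS_BUILD_AIO=1 pip install --no-cache-dir deepspeed

With libaio-dev present, the op table should report async_io as compatible instead of [NO].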
[OKAY] -/bin/sh: line 0: type: git: not found -fused_adam ............. [NO] ....... [OKAY] -/bin/sh: line 0: type: git: not found -fused_lamb ............. [NO] ....... [OKAY] -/bin/sh: line 0: type: git: not found -sparse_attn ............ [NO] ....... [OKAY] -/bin/sh: line 0: type: git: not found -transformer ............ [NO] ....... [OKAY] -/bin/sh: line 0: type: git: not found -stochastic_transformer . [NO] ....... [OKAY] -/bin/sh: line 0: type: git: not found -DeepSpeed general environment info:DeepSpeed general environment info:DeepSpeed general environment info: - - -/bin/sh: line 0: type: git: not found -torch install pathtorch install pathtorch install path ............................................. ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - - -torch versiontorch versiontorch version ............................................................ 1.8.11.8.11.8.1 - - -/bin/sh: line 0: type: git: not found -torch cuda versiontorch cuda versiontorch cuda version ............................................. 11.111.111.1 - - -nvcc versionnvcc versionnvcc version ............................................................... 11.211.211.2 - - -/bin/sh: line 0: type: git: not found -deepspeed install pathdeepspeed install path deepspeed install path ........... ........... ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info -deepspeed info deepspeed info................... ...................0.4.2+bc17042, bc17042, big-science................... - 0.4.2+bc17042, bc17042, big-science0.4.2+bc17042, bc17042, big-sciencedeepspeed wheel compiled w. - - deepspeed wheel compiled w.deepspeed wheel compiled w....... ............torch 1.8, cuda 11.1 -torch 1.8, cuda 11.1torch 1.8, cuda 11.1 - -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -transformer_inference .. utils[NO] ......................... [YES][OKAY] -...... [OKAY] -utils quantizer.................. ..............[YES] [NO]...... .......[OKAY] -[OKAY] -quantizer-------------------------------------------------- -.............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed general environment info: -DeepSpeed general environment info: -torch install path ............... 
['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -nvcc version ..................... 11.2 -torch version .................... 1.8.1 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -torch cuda version ............... 11.1 -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - - -async_io ............... [NO] async_ioasync_io....... [NO].............................. - [NO][NO] .............. [NO][NO] - -transformer_inference .. [NO] ....... [OKAY] -transformer_inferencetransformer_inference .. ..[NO] utils [NO] ....... .................. ....... [OKAY] [YES] -[OKAY] -...... [OKAY] -utils utils..................quantizer ..................[YES].............. [YES]......[NO] ......[OKAY]....... - [OKAY][OKAY] - -quantizerquantizer --------------------------------------------------............................ - [NO][NO] .............. [OKAY][OKAY] - ----------------------------------------------------------------------------------------------------- - --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_io async_io............... ...............[NO] [NO] .............. [NO][NO] - -transformer_inferencetransformer_inference .... [NO][NO] .............. [OKAY][OKAY] - - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -utils .................. utils[YES] ........................ 
[YES][OKAY] - ...... [OKAY] -async_io ............... [NO] ....... [NO] -async_io ............... [NO] ....... [NO] -quantizer .............. quantizer[NO] ..................... [NO][OKAY] -....... [OKAY] -transformer_inference .. [NO] ....... [OKAY] -transformer_inference .. [NO] ....... [OKAY] --------------------------------------------------- --------------------------------------------------- -/bin/sh: line 0: type: git: not found -utils .................. [YES] ...... [OKAY] -/bin/sh: line 0: type: git: not found -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed general environment info: --------------------------------------------------- -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -DeepSpeed general environment info: -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -sparse_attn ............ [NO] ....... [OKAY] -torch cuda version ............... 11.1 -transformer ............ [NO] ....... [OKAY] -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -stochastic_transformer . [NO] ....... [OKAY] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -DeepSpeed general environment info: -quantizer .............. [NO] ....... [OKAY] -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] --------------------------------------------------- -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 
0.4.2+bc17042, bc17042, big-science - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -ninja .................. [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -op name ................ installed .. compatible --------------------------------------------------- -async_io ............... [NO] ....... [NO] -cpu_adam ............... [YES] ...... [OKAY] -transformer_inference .. [NO] ....... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -quantizer .............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] --------------------------------------------------- -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. 
Op compatibility means that your system - meet the required dependencies to JIT install the op. -DeepSpeed general environment info: --------------------------------------------------- -JIT compiled ops requires ninja -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -nvcc version ..................... 11.2 -async_io ............... [NO] ....... [NO] -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -transformer_inference .. [NO] ....... [OKAY] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -utils .................. [YES] ...... [OKAY] -/bin/sh: line 0: type: git: not found -quantizer .............. [NO] ....... [OKAY] -/bin/sh: line 0: type: git: not found --------------------------------------------------- -/bin/sh: line 0: type: git: not found -DeepSpeed general environment info: -/bin/sh: line 0: type: git: not found -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -/bin/sh: line 0: type: git: not found -torch version .................... 1.8.1 -/bin/sh: line 0: type: git: not found -DeepSpeed general environment info: -/bin/sh: line 0: type: git: not found -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -/bin/sh: line 0: type: git: not found --------------------------------------------------- -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -torch cuda version ............... 11.1 -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -JIT compiled ops requires ninja -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -DeepSpeed general environment info: -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -torch install path ............... 
['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -torch version .................... 1.8.1 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -torch cuda version ............... 11.1 -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_ioasync_io .............................. [NO][NO] .............. [NO][NO] - -transformer_inferencetransformer_inference .... [NO][NO] .............. [OKAY] -[OKAY] -utilsutils .................................... [YES][YES] ............ [OKAY][OKAY] - -quantizer ..............quantizer [NO] ..................... [NO][OKAY] -....... [OKAY] --------------------------------------------------- --------------------------------------------------- --------------------------------------------------- -ninja .................. [OKAY] -DeepSpeed C++/CUDA extension op report --------------------------------------------------- --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... 
[OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -DeepSpeed general environment info: -op name ................ installed .. compatible -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] --------------------------------------------------- -torch version .................... 1.8.1 -cpu_adam ............... [YES] ...... [OKAY] -torch cuda version ............... 11.1 -fused_adam ............. [NO] ....... [OKAY] -nvcc version ..................... 11.2 -fused_lamb ............. [NO] ....... [OKAY] -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -sparse_attn ............ [NO] ....... [OKAY] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -DeepSpeed general environment info: -/bin/sh: line 0: type: git: not found --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -/bin/sh: line 0: type: git: not found -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -/bin/sh: line 0: type: git: not found --------------------------------------------------- -DeepSpeed C++/CUDA extension op report -/bin/sh: line 0: type: git: not found -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -/bin/sh: line 0: type: git: not found --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. -/bin/sh: line 0: type: git: not found -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found --------------------------------------------------- -JIT compiled ops requires ninja -/bin/sh: line 0: type: git: not found -ninja .................. [OKAY] -/bin/sh: line 0: type: git: not found --------------------------------------------------- -/bin/sh: line 0: type: git: not found -op name ................ installed .. 
compatible --------------------------------------------------- -/bin/sh: line 0: type: git: not found -cpu_adam ............... [YES] ...... [OKAY] -/bin/sh: line 0: type: git: not found -fused_adam ............. [NO] ....... [OKAY] -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -ninja .................. [OKAY]cpu_adam - ...............-------------------------------------------------- -[YES] ......op name ................[OKAY] -installed .. compatible --------------------------------------------------- -fused_adam ............. [NO] cpu_adam....... ...............[OKAY] -[YES] ...... fused_lamb[OKAY] -............. [NO] ....... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -sparse_attn fused_lamb............ .............[NO] [NO]....... .......[OKAY] -[OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer sparse_attn ............. [NO][NO] ....... .......[OKAY] -[OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -ninjafused_adam ............................... [OKAY][NO] - --------------------------------------------------....... - op name[OKAY] -................ installed fused_lamb.. compatible............. - --------------------------------------------------[NO] - ....... 
[OKAY]
-ninja .................. [OKAY]
---------------------------------------------------
-op name ................ installed .. compatible
---------------------------------------------------
-cpu_adam ............... [YES] ...... [OKAY]
-fused_adam ............. [NO] ....... [OKAY]
-fused_lamb ............. [NO] ....... [OKAY]
-sparse_attn ............ [NO] ....... [OKAY]
-transformer ............ [NO] ....... [OKAY]
-stochastic_transformer . [NO] ....... [OKAY]
- [WARNING] async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`.
-async_io ............... [NO] ....... [NO]
-transformer_inference .. [NO] ....... [OKAY]
-utils .................. [YES] ...... [OKAY]
-quantizer .............. [NO] ....... [OKAY]
---------------------------------------------------
-DeepSpeed general environment info:
-torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']
-torch version .................... 1.8.1
-torch cuda version ............... 11.1
-nvcc version ..................... 11.2
-deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']
-deepspeed info ................... 0.4.2+bc17042, bc17042, big-science
-deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1
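The op-compatibility table and environment summary above are one rank's copy of DeepSpeed's own diagnostic output, printed once per process (512 times in the raw log). A minimal sketch of how to regenerate it for a single rank, assuming the `ds_report` console script installed with this DeepSpeed version is on PATH and that its `op_builder` API is available:

    import subprocess

    # Regenerate the op/environment report shown above on one rank.
    # Assumes the `ds_report` entry point installed with DeepSpeed.
    subprocess.run(["ds_report"], check=True)

    # The per-op [YES]/[NO] column can also be queried programmatically,
    # assuming this DeepSpeed version's op_builder API:
    from deepspeed.ops.op_builder import CPUAdamBuilder
    print(CPUAdamBuilder().is_compatible())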
---------------------------------------------------
-DeepSpeed C++/CUDA extension op report
---------------------------------------------------
-NOTE: Ops not installed will be just-in-time (JIT) compiled at
-      runtime if needed. Op compatibility means that your system
-      meet the required dependencies to JIT install the op.
---------------------------------------------------
-JIT compiled ops requires ninja
-/bin/sh: line 0: type: git: not found
-**** Git info for Megatron: git_hash=unknown git_branch=unknown ****
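The repeated `/bin/sh: line 0: type: git: not found` lines and the `git_hash=unknown` banner come from each rank probing for a `git` binary at startup; `git` is not on PATH inside the compute job, so the probe fails and the git info falls back to `unknown`. A minimal sketch of such a probe (hypothetical helper name, not necessarily Megatron-DeepSpeed's exact code):

    import subprocess

    def command_exists(cmd: str) -> bool:
        # Running `type <cmd>` under /bin/sh is what emits
        # "/bin/sh: line 0: type: git: not found" on ranks without git.
        result = subprocess.Popen(f"type {cmd}", stdout=subprocess.PIPE, shell=True)
        return result.wait() == 0

    if command_exists("git"):
        git_hash = subprocess.check_output(
            ["git", "rev-parse", "--short", "HEAD"]).decode().strip()
        git_branch = subprocess.check_output(
            ["git", "rev-parse", "--abbrev-ref", "HEAD"]).decode().strip()
    else:
        git_hash = git_branch = "unknown"

    print(f"**** Git info for Megatron: git_hash={git_hash} git_branch={git_branch} ****")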
-using world size: 512, data-parallel-size: 16, tensor-model-parallel size: 4, pipeline-model-parallel size: 8
-using torch.float16 for parameters ...
------------------------- arguments ------------------------
- accumulate_allreduce_grads_in_fp32 .............. False
- adam_beta1 ...................................... 0.9
- adam_beta2 ...................................... 0.999
- adam_eps ........................................ 1e-08
- adlr_autoresume ................................. False
- adlr_autoresume_interval ........................ 1000
- apply_query_key_layer_scaling ................... True
- apply_residual_connection_post_layernorm ........ False
- attention_dropout ............................... 0.1
- attention_softmax_in_fp32 ....................... False
- bert_binary_head ................................ True
- bert_load ....................................... None
- bf16 ............................................ False
- bias_dropout_fusion ............................. True
- bias_gelu_fusion ................................ True
- biencoder_projection_dim ........................ 0
- biencoder_shared_query_context_model ............ False
- block_data_path ................................. None
- checkpoint_activations .......................... True
- checkpoint_in_cpu ............................... False
- checkpoint_num_layers ........................... 1
- clip_grad ....................................... 1.0
- codecarbon_dir .................................. /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/tr8-104B-logs/codecarbon
- consumed_train_samples .......................... 0
- consumed_valid_samples .......................... 0
- contigious_checkpointing ........................ False
- cpu_optimizer ................................... False
- cpu_torch_adam .................................. False
- data_impl ....................................... mmap
- data_parallel_size .............................. 16
- data_path ....................................... ['/gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document']
- dataloader_type ................................. single
- DDP_impl ........................................ local
- decoder_seq_length .............................. None
- deepscale ....................................... False
- deepscale_config ................................ None
- deepspeed ....................................... True
- deepspeed_activation_checkpointing .............. True
- deepspeed_config ................................ ./ds_config.1188168.json
- deepspeed_mpi ................................... False
- distribute_checkpointed_activations ............. False
- distributed_backend ............................. nccl
- embedding_path .................................. None
- encoder_seq_length .............................. 2048
- eod_mask_loss ................................... False
- eval_interval ................................... 1000
- eval_iters ...................................... 5
- evidence_data_path .............................. None
- exit_duration_in_mins ........................... 1190
- exit_interval ................................... None
- ffn_hidden_size ................................. 20480
- finetune ........................................ False
- fp16 ............................................ True
- fp16_lm_cross_entropy ........................... False
- fp32_residual_connection ........................ False
- global_batch_size ............................... 2048
- hidden_dropout .................................. 0.1
- hidden_size ..................................... 16384
- hysteresis ...................................... 2
- ict_head_size ................................... None
- ict_load ........................................ None
- img_dim ......................................... 224
- indexer_batch_size .............................. 128
- indexer_log_interval ............................ 1000
- init_method_std ................................. 0.02
- init_method_xavier_uniform ...................... False
- initial_loss_scale .............................. 4294967296
- kv_channels ..................................... 512
- layernorm_epsilon ............................... 1e-05
- lazy_mpu_init ................................... None
- load ............................................ /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints
- local_rank ...................................... 0
- log_batch_size_to_tensorboard ................... True
- log_interval .................................... 10
- log_learning_rate_to_tensorboard ................ True
- log_loss_scale_to_tensorboard ................... True
- log_num_zeros_in_grad ........................... False
- log_params_norm ................................. False
- log_timers_to_tensorboard ....................... True
- log_validation_ppl_to_tensorboard ............... True
- loss_scale ...................................... 12.0
- loss_scale_window ............................... 1000
- lr .............................................. 6e-05
- lr_decay_iters .................................. None
- lr_decay_samples ................................ 126953125
- lr_decay_style .................................. cosine
- lr_warmup_fraction .............................. None
- lr_warmup_iters ................................. 0
- lr_warmup_samples ............................... 216320
- make_vocab_size_divisible_by .................... 128
- mask_prob ....................................... 0.15
- masked_softmax_fusion ........................... True
- max_position_embeddings ......................... 2048
- memory_centric_tiled_linear ..................... False
- merge_file ...................................... /gpfswork/rech/six/commun/code/tr8-104B/Megatron-DeepSpeed-tr8-104B/data/gpt2-merges.txt
- micro_batch_size ................................ 1
- min_loss_scale .................................. 1.0
- min_lr .......................................... 6e-06
- mmap_warmup ..................................... False
- no_load_optim ................................... None
- no_load_rng ..................................... None
- no_save_optim ................................... None
- no_save_rng ..................................... None
- num_attention_heads ............................. 32
- num_channels .................................... 3
- num_classes ..................................... 1000
- num_layers ...................................... 32
- num_layers_per_virtual_pipeline_stage ........... None
- num_workers ..................................... 2
- onnx_safe ....................................... None
- openai_gelu ..................................... False
- optimizer ....................................... adam
- override_lr_scheduler ........................... False
- params_dtype .................................... torch.float16
- partition_activations ........................... False
- patch_dim ....................................... 16
- pipeline_model_parallel_size .................... 8
- position_embedding_type ......................... PositionEmbeddingType.absolute
- profile_backward ................................ False
- query_in_block_prob ............................. 0.1
- rampup_batch_size ............................... ['16', '16', '6_000_000']
- rank ............................................ 0
- remote_device ................................... none
- reset_attention_mask ............................ False
- reset_position_ids .............................. False
- retriever_report_topk_accuracies ................ []
- retriever_score_scaling ......................... False
- retriever_seq_length ............................ 256
- sample_rate ..................................... 1.0
- save ............................................ /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints
- save_interval ................................... 1500
- scatter_gather_tensors_in_pipeline .............. True
- scattered_embeddings ............................ False
- seed ............................................ 42
- seq_length ...................................... 2048
- sgd_momentum .................................... 0.9
- short_seq_prob .................................. 0.1
- split ........................................... 949,50,1
- split_transformers .............................. False
- synchronize_each_layer .......................... False
- tensor_model_parallel_size ...................... 4
- tensorboard_dir ................................. /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/tr8-104B-logs/tensorboard
- tensorboard_log_interval ........................ 1
- tensorboard_queue_size .......................... 5
- tile_factor ..................................... 1
- titles_data_path ................................ None
- tokenizer_name_or_path .......................... None
- tokenizer_type .................................. GPT2BPETokenizer
- train_iters ..................................... None
- train_samples ................................... 300000000
- use_checkpoint_lr_scheduler ..................... False
- use_contiguous_buffers_in_ddp ................... False
- use_cpu_initialization .......................... None
- use_one_sent_docs ............................... False
- use_pin_memory .................................. False
- virtual_pipeline_model_parallel_size ............ None
- vocab_extra_ids ................................. 0
- vocab_file ...................................... /gpfswork/rech/six/commun/code/tr8-104B/Megatron-DeepSpeed-tr8-104B/data/gpt2-vocab.json
- weight_decay .................................... 0.1
- world_size ...................................... 512
- zero_allgather_bucket_size ...................... 0.0
- zero_contigious_gradients ....................... False
- zero_reduce_bucket_size ......................... 0.0
- zero_reduce_scatter ............................. False
- zero_stage ...................................... 1
--------------------- end of arguments ---------------------
-will use batch size rampup starting from global batch size 16 to global batch size 2048 with batch size increments 16 over 6000000 samples.
-> building GPT2BPETokenizer tokenizer ...
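The arguments above pin down the run's layout and schedule: the 512 GPUs decompose as 16-way data parallel x 4-way tensor parallel x 8-way pipeline parallel, and the global batch grows from 16 to 2048 in steps of 16 spread over 6,000,000 samples (per `rampup_batch_size = ['16', '16', '6_000_000']`). A small sketch of that schedule, consistent with the logged values but not Megatron's actual implementation:

    # Batch-size rampup implied by the log: start at 16, grow by 16 up to
    # 2048, with the 127 increments spread evenly over 6,000,000 samples.
    start, increment, ramp_samples, final_bs = 16, 16, 6_000_000, 2048

    num_increments = (final_bs - start) // increment        # 127 steps
    samples_per_increment = ramp_samples / num_increments   # ~47,244 samples each

    def global_batch_size(consumed_samples: int) -> int:
        """Global batch size after `consumed_samples` training samples."""
        if consumed_samples >= ramp_samples:
            return final_bs
        steps = int(consumed_samples / samples_per_increment)
        return min(start + steps * increment, final_bs)

    assert 16 * 4 * 8 == 512            # DP x TP x PP = world size
    assert global_batch_size(0) == 16
    assert global_batch_size(6_000_000) == 2048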
11.11.8.1 -/bin/sh: line 0: type: git: not found - -nvcc version .....................torch cuda version 11.2............... - deepspeed install path11.1 -...........nvcc version .....................['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -11.2deepspeed info - deepspeed install path................... ...........0.4.2+bc17042, bc17042, big-science -['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']deepspeed wheel compiled w. -......deepspeed info torch 1.8, cuda 11.1................... - 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -/bin/sh: line 0: type: git: not found -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] -/bin/sh: line 0: type: git: not found --------------------------------------------------- -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -DeepSpeed general environment info: -/bin/sh: line 0: type: git: not found -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -/bin/sh: line 0: type: git: not found -torch version .................... 1.8.1 -/bin/sh: line 0: type: git: not found -torch cuda version ............... 11.1 -/bin/sh: line 0: type: git: not found -nvcc version ..................... 11.2 -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -/bin/sh: line 0: type: git: not found -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. 
--------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -DeepSpeed general environment info: -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -torch install path ............... 
['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: --------------------------------------------------- -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -torch version .................... 1.8.1 -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -ninja .................. [OKAY] -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science --------------------------------------------------- -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -/bin/sh: line 0: type: git: not found -transformer ............ [NO] ....... [OKAY] -/bin/sh: line 0: type: git: not found -stochastic_transformer . [NO] ....... [OKAY] -/bin/sh: line 0: type: git: not found -DeepSpeed general environment info: -/bin/sh: line 0: type: git: not found - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -/bin/sh: line 0: type: git: not found -async_io ............... [NO] ....... [NO] -torch version .................... 1.8.1 -transformer_inference .. [NO] ....... [OKAY] -ninjaninja .................. ..................[OKAY] - [OKAY]-------------------------------------------------- - -torch cuda version ............... 11.1 -utils .................. [YES] ...... [OKAY] ---------------------------------------------------op name -nvcc version ..................... 11.2 -quantizer .............. [NO] ....... [OKAY] -................ op nameinstalled .................. compatibleinstalled - --------------------------------------------------.. - compatible --------------------------------------------------- -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science --------------------------------------------------- -cpu_adam ............... [YES]cpu_adam ...... 
...............[OKAY] -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -[YES] ...... [OKAY] -fused_adam ............. [NO] ....... fused_adam[OKAY] - ............. fused_lamb[NO] .................... [NO] [OKAY]....... - [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -/bin/sh: line 0: type: git: not found -sparse_attn ............ [NO] ....... [OKAY] -sparse_attntransformer ........................ [NO][NO] .............. [OKAY][OKAY] - -transformer stochastic_transformer............ [NO]. [NO]....... .......[OKAY] [OKAY] - -stochastic_transformer . [NO] ....... [OKAY] -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -ninja .................. [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -DeepSpeed general environment info: -async_io  [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`................  [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`.[NO] - ....... - [NO] -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -async_iotransformer_inference ................. async_io[NO][NO] ............................. [NO][OKAY][NO] - -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -....... [NO] -/bin/sh: line 0: type: git: not found -nvcc version ..................... 11.2 -utils .................. [YES] ...... [OKAY] -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -transformer_inference transformer_inference.. ..[NO]quantizer .......[NO].............. [OKAY][NO]....... - .......[OKAY] -[OKAY] -utils--------------------------------------------------utils -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - .................................... [YES][YES] ............ [OKAY][OKAY] - -quantizerquantizer ............................ [NO][NO] .............. 
[OKAY][OKAY] - -/bin/sh: line 0: type: git: not found ----------------------------------------------------------------------------------------------------- - --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -DeepSpeed general environment info: -cpu_adam ............... [YES] ...... [OKAY] -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -fused_adam ............. [NO] ....... [OKAY] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -fused_lamb ............. [NO] ....... [OKAY] -nvcc version ..................... 11.2 -sparse_attn ............ [NO] ....... [OKAY] -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -transformer ............ [NO] ....... [OKAY] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -stochastic_transformer . [NO] ....... [OKAY] -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -/bin/sh: line 0: type: git: not found -fused_lamb ............. [NO] ....... [OKAY] -/bin/sh: line 0: type: git: not found -sparse_attn ............ [NO] ....... [OKAY] -/bin/sh: line 0: type: git: not found -transformer ............ [NO] ....... [OKAY] -/bin/sh: line 0: type: git: not found -stochastic_transformer . [NO] ....... [OKAY] -/bin/sh: line 0: type: git: not found -DeepSpeed general environment info: -/bin/sh: line 0: type: git: not found -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -ninja .................. 
-DeepSpeed general environment info:
-torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']
-torch version .................... 1.8.1
-torch cuda version ............... 11.1
-nvcc version ..................... 11.2
-deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']
-deepspeed info ................... 0.4.2+bc17042, bc17042, big-science
-deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1
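The environment block above can be cross-checked from Python with nothing beyond the two packages themselves. Note that the reported nvcc (11.2) differs from the CUDA version the torch wheel was built against (11.1); that minor-version mismatch evidently did not hurt op compatibility here, since every op reports [OKAY]. A quick sketch:

    # Cross-check the "DeepSpeed general environment info" block above.
    import torch
    import deepspeed

    print(torch.__version__)      # expected: 1.8.1
    print(torch.version.cuda)     # expected: 11.1 (CUDA the wheel was built with)
    print(deepspeed.__version__)  # expected: 0.4.2+bc17042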
-/bin/sh: line 0: type: git: not found
-**** Git info for Megatron: git_hash=unknown git_branch=unknown ****
['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -/bin/sh: line 0: type: git: not found -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -async_io ............... [NO] ....... [NO] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -transformer_inference .. [NO] ....... [OKAY] -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. 
-utils .................. [YES] ...... [OKAY] -DeepSpeed general environment info: -async_io ............... [NO] ....... [NO] -quantizer .............. [NO] ....... [OKAY] -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -transformer_inference .. [NO] ....... [OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -torch version .................... 1.8.1 -utils .................. [YES] ...... [OKAY] -torch cuda version ............... 11.1 -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. -nvcc version ..................... 11.2 --------------------------------------------------- --------------------------------------------------- -JIT compiled ops requires ninja -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -async_io ............... [NO] ....... [NO] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -transformer_inference .. [NO] ....... [OKAY] -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... 
['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -DeepSpeed general environment info: -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -ninjaninja .................................... [OKAY][OKAY] - ----------------------------------------------------------------------------------------------------- - -op nameop name ................ ................installed installed.. compatible.. - compatible-------------------------------------------------- - --------------------------------------------------- -cpu_adam ...............cpu_adam [YES] ..................... [OKAY][YES] - ...... [OKAY] -fused_adam ............. fused_adam[NO] .................... [OKAY] -[NO] ....... fused_lamb[OKAY] ............. - [NO] .......fused_lamb [OKAY]............. - [NO] ....... [OKAY] -sparse_attn ............ sparse_attn[NO] ................... [OKAY][NO] - ....... transformer[OKAY] -............ [NO] ....... [OKAY]transformer - ............ [NO] stochastic_transformer....... [OKAY]. - [NO] ....... stochastic_transformer[OKAY] -. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 --------------------------------------------------- -DeepSpeed C++/CUDA extension op report -nvcc version ..................... 11.2 --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. 
--------------------------------------------------- -JIT compiled ops requires ninja -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science ----------------------------------------------------------------------------------------------------- - -DeepSpeed C++/CUDA extension op reportDeepSpeed C++/CUDA extension op report - ----------------------------------------------------------------------------------------------------- - -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - ----------------------------------------------------------------------------------------------------- - -JIT compiled ops requires ninja -JIT compiled ops requires ninja -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -ninja .................. [OKAY] -torch version .................... 1.8.1 --------------------------------------------------- -torch cuda version ............... 11.1 -op name ................ installed .. compatible --------------------------------------------------- -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -cpu_adam ............... [YES] ...... [OKAY] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -fused_adam ............. [NO] ....... [OKAY] -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -transformer ............ [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -stochastic_transformer . [NO] ....... [OKAY] -async_io ............... [NO] ....... [NO] -ninja .................. [OKAY] -transformer_inference .. [NO] ....... [OKAY] --------------------------------------------------- -utils .................. [YES] ...... [OKAY] -op name ................ installed .. compatible --------------------------------------------------- -quantizer .............. [NO] ....... [OKAY] -cpu_adam ............... [YES] ...... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. 
[WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -fused_adam ............. [NO] ....... [OKAY] -async_ioasync_io .............................. [NO] [NO]....... .......[NO] -[NO] -fused_lamb ............. [NO] ....... [OKAY] -transformer_inference transformer_inference.. ..[NO] [NO]....... .......[OKAY] -[OKAY] -sparse_attn ............ [NO] ....... [OKAY] -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -utilsutils .................................... [YES][YES] ............ [OKAY][OKAY] -transformer ............ [NO] ....... [OKAY] - -stochastic_transformer . [NO] ....... [OKAY] -quantizerquantizer ............................ [NO][NO] .............. [OKAY][OKAY] - ----------------------------------------------------------------------------------------------------- -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] --------------------------------------------------- -utils .................. [YES] ...... [OKAY] -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -quantizer .............. [NO] ....... [OKAY] -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2DeepSpeed general environment info: -deepspeed install path ........... - ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed infotorch install path ................... ...............0.4.2+bc17042, bc17042, big-science - deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... 
torch 1.8, cuda 11.1 --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** --------------------------------------------------- -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -ninjaninjaninja .................................... ..................[OKAY][OKAY] -[OKAY] - -utils .................. [YES] ...... [OKAY] ------------------------------------------------------------------------------------------------------------------------------------------------------- - - -quantizer .............. [NO] ....... [OKAY] -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -op nameop nameop name ................................................ installedinstalledinstalled ...... compatiblecompatiblecompatible - -/bin/sh: line 0: type: git: not found --------------------------------------------------- -/bin/sh: line 0: type: git: not found - ------------------------------------------------------------------------------------------------------------------------------------------------------- - - -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -cpu_adamcpu_adam .............................. [YES][YES] ............ [OKAY][OKAY] - -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -cpu_adam fused_adamfused_adam .......................... [NO]...............[NO] .......[YES]....... [OKAY]......[OKAY] -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found - -[OKAY]fused_lamb - fused_lamb............. .............[NO] [NO]....... .......[OKAY] -[OKAY]fused_adam - ............. [NO] ....... [OKAY] -sparse_attn sparse_attn............fused_lamb .........................[NO] [NO][NO]....... .......[OKAY] -[OKAY] - .......transformertransformer [OKAY]............ -............ [NO][NO] .............. [OKAY][OKAY] - -stochastic_transformerstochastic_transformer .. [NO][NO] .............. [OKAY][OKAY] - -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ 
[NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -DeepSpeed general environment info: -sparse_attn ............ [NO] ....... [OKAY] -DeepSpeed general environment info: -transformer ............ [NO] ....... [OKAY] -torch install path ............... torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version ....................['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -stochastic_transformer . [NO] ....... [OKAY] -1.8.1 -torch version torch cuda version.................... ...............1.8.1 -11.1 -torch cuda versionnvcc version .................................... 11.111.2 - -DeepSpeed general environment info: -nvcc versiondeepspeed install path ................................ 11.2 -['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']deepspeed install path - deepspeed info........... ................... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']0.4.2+bc17042, bc17042, big-science - -deepspeed infodeepspeed wheel compiled w. ......................... 0.4.2+bc17042, bc17042, big-sciencetorch 1.8, cuda 11.1 - -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -DeepSpeed general environment info: -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... 
['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install pathtorch install path .............................. ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch versiontorch version ........................................ 1.8.11.8.1 - -torch cuda versiontorch cuda version .............................. 11.111.1 - -nvcc versionnvcc version .......................................... 11.211.2 - -deepspeed install pathdeepspeed install path ...................... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed infodeepspeed info ...................................... 0.4.2+bc17042, bc17042, big-science0.4.2+bc17042, bc17042, big-science - -deepspeed wheel compiled w.deepspeed wheel compiled w. ............ torch 1.8, cuda 11.1torch 1.8, cuda 11.1 - -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... 
torch 1.8, cuda 11.1 -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install path torch install path............... ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch versiontorch version ........................................ 1.8.11.8.1 - -torch cuda versiontorch cuda version .............................. 11.111.1 - -nvcc versionnvcc version .......................................... 11.211.2 - -deepspeed install pathdeepspeed install path ...................... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed infodeepspeed info ...................................... 0.4.2+bc17042, bc17042, big-science0.4.2+bc17042, bc17042, big-science - -deepspeed wheel compiled w.deepspeed wheel compiled w. ............ torch 1.8, cuda 11.1torch 1.8, cuda 11.1 -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_ioasync_io .............................. [NO][NO] .............. [NO][NO] - -DeepSpeed general environment info: -transformer_inference transformer_inference.. ..[NO] [NO]....... .......[OKAY] -[OKAY] -torch install path ............... 
['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -utilsutils .................................... [YES][YES] ............ [OKAY][OKAY] - -torch version .................... 1.8.1 -quantizer .............. quantizer[NO] ..................... [NO][OKAY] -torch cuda version ............... 11.1 -....... [OKAY] --------------------------------------------------- -nvcc version ..................... 11.2 --------------------------------------------------- -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -DeepSpeed general environment info:torch install path -............... torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -.................... torch version1.8.1 -.................... torch cuda version1.8.1 -............... torch cuda version11.1 -...............nvcc version 11.1..................... - nvcc version11.2 -.....................deepspeed install path 11.2........... - deepspeed install path ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']........... - deepspeed info ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']................... - deepspeed info0.4.2+bc17042, bc17042, big-science -...................deepspeed wheel compiled w. 0.4.2+bc17042, bc17042, big-science...... - deepspeed wheel compiled w.torch 1.8, cuda 11.1 -...... torch 1.8, cuda 11.1 -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_io async_io............... ...............[NO] [NO]....... [NO]....... - [NO] -transformer_inference .. [NO] ....... [OKAY]transformer_inference - .. [NO] ....... [OKAY]utils - .................. [YES] ...... utils[OKAY] -.................. [YES] ...... quantizer[OKAY] -.............. [NO] quantizer....... ..............[OKAY] -[NO] ....... 
[OKAY] --------------------------------------------------- --------------------------------------------------- -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... 
torch 1.8, cuda 11.1 -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -DeepSpeed general environment info: -torch cuda version ............... 11.1 -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -nvcc version ..................... 11.2 -torch version .................... 1.8.1 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -torch cuda version ............... 11.1 -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -DeepSpeed general environment info: -torch cuda version ............... 11.1 -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -nvcc version ..................... 11.2 -torch version .................... 1.8.1 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -torch cuda version ............... 11.1 -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... 
['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install pathtorch install path .............................. ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch versiontorch version ........................................ 1.8.11.8.1 - -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -torch cuda versiontorch cuda version .............................. 11.111.1 - -nvcc versionnvcc version .......................................... 11.211.2 - -deepspeed install pathdeepspeed install path ...................... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed infodeepspeed info ...................................... 0.4.2+bc17042, bc17042, big-science0.4.2+bc17042, bc17042, big-science - -deepspeed wheel compiled w.deepspeed wheel compiled w. ............ torch 1.8, cuda 11.1torch 1.8, cuda 11.1 - -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 
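The op-compatibility report and environment summary above can also be regenerated outside the launcher. A minimal sketch in Python, assuming DeepSpeed's ds_report console script from this wheel is on PATH (invoking it via subprocess is a convenience choice, not the only way):

    import subprocess

    # Print DeepSpeed's op-compatibility report and general environment
    # info: the same block that each rank logs at start-up.
    subprocess.run(["ds_report"], check=True)

The [WARNING] about libaio-dev is benign for this run: async_io stays [NO] and training proceeds without it.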
- > padded vocab (size: 50257) with 431 dummy tokens (new size: 50688)
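The padding arithmetic behind that line: 50257 is rounded up to the next multiple of 512 (50688 = 512 * 99), so 431 dummy tokens are appended. A minimal sketch, assuming the unit of 512 comes from Megatron's --make-vocab-size-divisible-by 128 times a tensor-parallel degree of 4 (the product 512 is what the logged numbers imply; the split between the two factors is an assumption):

    def pad_vocab(orig_size: int, divisible_by: int = 128, tp_size: int = 4) -> int:
        # Round orig_size up to the next multiple of divisible_by * tp_size,
        # mirroring Megatron's vocab-padding step; the factor values used
        # here are assumptions chosen to reproduce the logged numbers.
        multiple = divisible_by * tp_size
        padded = orig_size
        while padded % multiple != 0:
            padded += 1
        return padded

    assert pad_vocab(50257) == 50688  # adds 431 dummy tokens, matching the log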
[OKAY] -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1DeepSpeed general environment info: - -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... 
torch 1.8, cuda 11.1 -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -async_io ............... [NO]transformer_inference ......... [NO][NO] - ....... [OKAY] -utils .................. [YES]transformer_inference ........ [OKAY][NO] - ....... [OKAY]quantizer - .............. [NO] ....... [OKAY]utils - .................. [YES] --------------------------------------------------...... - [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed general environment info: -DeepSpeed general environment info:torch install path ............... - torch install path ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']............... - torch version .................... 1.8.1 -['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch cuda version ...............torch version 11.1.................... - nvcc version1.8.1 -..................... 11.2torch cuda version - deepspeed install path............... ...........11.1 -['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']nvcc version - deepspeed info..................... ...................11.2 -0.4.2+bc17042, bc17042, big-sciencedeepspeed install path - deepspeed wheel compiled w............ ...... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']torch 1.8, cuda 11.1 - -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... 
-/bin/sh: line 0: type: git: not found
-> setting codecarbon ...
---------------------------------------------------
-DeepSpeed C++/CUDA extension op report
---------------------------------------------------
-NOTE: Ops not installed will be just-in-time (JIT) compiled at
-      runtime if needed. Op compatibility means that your system
-      meet the required dependencies to JIT install the op.
---------------------------------------------------
-JIT compiled ops requires ninja
-ninja .................. [OKAY]
---------------------------------------------------
-op name ................ installed .. compatible
---------------------------------------------------
-cpu_adam ............... [YES] ...... [OKAY]
-fused_adam ............. [NO] ....... [OKAY]
-fused_lamb ............. [NO] ....... [OKAY]
-sparse_attn ............ [NO] ....... [OKAY]
-transformer ............ [NO] ....... [OKAY]
-stochastic_transformer . [NO] ....... [OKAY]
---------------------------------------------------
-> initializing torch distributed ...
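Ops reported as installed [NO] but compatible [OKAY] are built just-in-time through ninja on first use, which is why only cpu_adam shows [YES] here. A hedged sketch of pre-building ops at install time instead, using DeepSpeed's documented DS_BUILD_* switches (whether these exact flags apply to this pinned 0.4.2 fork is an assumption):

    # pre-compile selected ops so no JIT compilation happens at job start
    DS_BUILD_FUSED_ADAM=1 DS_BUILD_FUSED_LAMB=1 pip install deepspeed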
11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -DeepSpeed general environment info:torch install path ............... - torch install path ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']............... - torch version .................... 1.8.1['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch cuda versiontorch version ................................... 11.11.8.1 - -nvcc version torch cuda version..................... ...............11.2 -11.1deepspeed install path - nvcc version........... ..................... 11.2['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed install pathdeepspeed info .............................. 0.4.2+bc17042, bc17042, big-science -['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']deepspeed wheel compiled w. - deepspeed info...... ...................torch 1.8, cuda 11.1 -0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** ------------------------------------------------------------------------------------------------------------------------------------------------------- - -DeepSpeed C++/CUDA extension op report -DeepSpeed C++/CUDA extension op report -DeepSpeed C++/CUDA extension op report - ----------------------------------------------------------------------------------------------------- - ---------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. 
Op compatibility means that your system - meet the required dependencies to JIT install the op.---------------------------------------------------------------------------------------------------- - - -----------------------------------------------------------------------------------------------------JIT compiled ops requires ninjaJIT compiled ops requires ninja - - - -JIT compiled ops requires ninjaDeepSpeed C++/CUDA extension op report - --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninjaninjaninjaninja ...................................................... .................. [OKAY][OKAY][OKAY][OKAY] - - - ----------------------------------------------------------------------------------------------------- ----------------------------------------------------------------------------------------------------- -op name - - op nameop nameop name................ ................................installed ................ installed installed ..installed .... compatible compatible.. - -compatible ---------------------------------------------------------------------------------------------------- -compatible - - ----------------------------------------------------------------------------------------------------- - -cpu_adamcpu_adam cpu_adam...............cpu_adam .............................. [YES] ............... [YES][YES] ...... [YES]......[OKAY]...... - [OKAY] ...... -[OKAY] -[OKAY] -fused_adam ............. [NO] .......fused_adamfused_adam fused_adam [OKAY] .......................... - ............. [NO] fused_lamb [NO][NO] ....... .................... [OKAY]....... [NO][OKAY] -[OKAY] - -fused_lamb.......fused_lamb [OKAY] fused_lamb - ............. ............. ............. [NO] [NO] [NO]....... ..............[OKAY] - [OKAY][OKAY] -sparse_attn - ............ [NO] ....... [OKAY] -transformer ............ sparse_attn[NO]sparse_attnsparse_attn ........................................... [OKAY] -[NO] .......[NO] stochastic_transformer[NO] [OKAY].............. - . [OKAY] [OKAY] -transformer[NO] - transformer............transformer....... ............[OKAY]............[NO] - [NO].......[NO] .......[OKAY]....... - [OKAY][OKAY] -stochastic_transformer - stochastic_transformer. stochastic_transformer[NO] . . ....... [NO] [NO] [OKAY].............. - [OKAY][OKAY] - - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... 
[OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed general environment info: -DeepSpeed general environment info: -torch install path ............... torch install path DeepSpeed general environment info:...............['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch version ....................['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']torch install path - 1.8.1............... -torch version torch cuda version.................... ...............1.8.1 -['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']11.1 -torch cuda version - nvcc versiontorch version ............... ......................................... 1.8.111.111.2 - - -nvcc versiondeepspeed install pathtorch cuda version ............................................... 11.2 -11.1['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed install pathnvcc version deepspeed info........... ........................................ ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']11.20.4.2+bc17042, bc17042, big-science - - -deepspeed infodeepspeed install path deepspeed wheel compiled w. ................... ........... ...... 0.4.2+bc17042, bc17042, big-science -torch 1.8, cuda 11.1['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed wheel compiled w. - deepspeed info ......................... torch 1.8, cuda 11.10.4.2+bc17042, bc17042, big-science - -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- - - - -DeepSpeed C++/CUDA extension op reportDeepSpeed C++/CUDA extension op reportDeepSpeed C++/CUDA extension op report -DeepSpeed C++/CUDA extension op report - - ------------------------------------------------------------------------------------------------------------------------------------------------------- - --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. 
-NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- --------------------------------------------------- ----------------------------------------------------------------------------------------------------- -JIT compiled ops requires ninja - -JIT compiled ops requires ninja -JIT compiled ops requires ninja -JIT compiled ops requires ninja - -ninjaninjaninjaninja ........................................................................ [OKAY][OKAY][OKAY][OKAY] - - - ----------------------------------------------------------------------------------------------------- ----------------------------------------------------------------------------------------------------- -op name - - op nameop nameop name ................ ................................ ................ installed installedinstalledinstalled .... .. .. compatiblecompatiblecompatiblecompatible - - - ------------------------------------------------------------------------------------------------------------------------------------------------------- --------------------------------------------------- - - -cpu_adam cpu_adamcpu_adam............... cpu_adam...............[YES]............... ............... [YES][YES] ...... [YES]...... ...... [OKAY] ......[OKAY] -[OKAY] -[OKAY] - -fused_adamfused_adam .............fused_adam............. fused_adam [NO][NO]............. ............. ....... .......[NO][NO][OKAY] -.......[OKAY]....... - fused_lamb[OKAY][OKAY] -fused_lamb -............. .............[NO] fused_lamb fused_lamb[NO] ....... ............. .................... [OKAY] [NO] -[NO] [OKAY] ....... -....... [OKAY][OKAY] - -sparse_attn ............ [NO] ....... [OKAY]sparse_attn - ............ transformer[NO]sparse_attn sparse_attn............................... [OKAY]............[NO] -[NO] [NO] ....... transformer....... ....... [OKAY] ............[OKAY][OKAY] - - -[NO]transformer .......stochastic_transformer............ transformer[OKAY] -.[NO]............ [NO].......[NO]stochastic_transformer [OKAY].............. - .[OKAY] -[OKAY][NO]stochastic_transformer - ....... .[OKAY]stochastic_transformer - [NO] ........ [NO][OKAY] ....... - [OKAY] -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... 
[WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [OKAY] - -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report-------------------------------------------------- --------------------------------------------------- --------------------------------------------------- - -DeepSpeed C++/CUDA extension op reportDeepSpeed C++/CUDA extension op report -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - ------------------------------------------------------------------------------------------------------------------------------------------------------- - - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.JIT compiled ops requires ninja - ------------------------------------------------------------------------------------------------------------------------------------------------------- - - - -JIT compiled ops requires ninjaJIT compiled ops requires ninjaDeepSpeed C++/CUDA extension op report - - --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninjaninjaninja ......................................................ninja [OKAY][OKAY][OKAY].................. - - - [OKAY]------------------------------------------------------------------------------------------------------------------------------------------------------ - - - -op name--------------------------------------------------op nameop name - ................................op name................ installed................installedinstalled ..installed.... compatible..compatiblecompatible - - -compatible-------------------------------------------------- ----------------------------------------------------------------------------------------------------- - - --------------------------------------------------- -cpu_adam cpu_adam...............cpu_adamcpu_adam ...............[YES].............................. [YES]......[YES][YES] [OKAY].................. - [OKAY][OKAY][OKAY] - - -fused_adam ............. fused_adamfused_adam[NO] fused_adam............. ............. ....... ............. [NO][NO] [OKAY] -.......[NO]....... 
[OKAY].......[OKAY]fused_lamb - -[OKAY]............. - fused_lamb[NO]fused_lamb .............fused_lamb.................... [NO].............[OKAY][NO] ....... -....... [NO] [OKAY] [OKAY] -....... - [OKAY] -sparse_attn ............ [NO]sparse_attn .......sparse_attn............ [OKAY]............sparse_attn[NO] - [NO]................... .......transformer[NO][OKAY] [OKAY] -................... - [NO][OKAY] transformer.......transformer - ............[OKAY]............ - [NO][NO]transformer .......................... stochastic_transformer [OKAY] [OKAY][NO] - - ........ [NO][OKAY] stochastic_transformer -stochastic_transformer....... [OKAY] -..stochastic_transformer [NO][NO] ............... [OKAY][NO] - [OKAY]....... - [OKAY] ------------------------------------------------------------------------------------------------------------------------------------------------------- - --------------------------------------------------- -DeepSpeed C++/CUDA extension op reportDeepSpeed C++/CUDA extension op report - -DeepSpeed C++/CUDA extension op report -DeepSpeed C++/CUDA extension op report-------------------------------------------------- --------------------------------------------------- - - -----------------------------------------------------------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.-------------------------------------------------- --------------------------------------------------- - - -----------------------------------------------------------------------------------------------------JIT compiled ops requires ninja -JIT compiled ops requires ninja -JIT compiled ops requires ninja - - -JIT compiled ops requires ninja -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -ninjaninjaninjaninja .................................... ....................................[OKAY][OKAY] - -[OKAY][OKAY] - ----------------------------------------------------------------------------------------------------- - ----------------------------------------------------------------------------------------------------- - -op nameop name op nameop name................................ ................installedinstalled................ ..installed.. installed compatiblecompatible .. - -.. 
-------------------------------------------------- --------------------------------------------------compatible -compatible - - ----------------------------------------------------------------------------------------------------- - -cpu_adamcpu_adam ...............cpu_adam............... cpu_adam [YES] ............... [YES]..................... ......[YES][OKAY][YES] - [OKAY]............ - [OKAY][OKAY] - -fused_adam .............fused_adam [NO]............. fused_adam.......[NO]fused_adam [OKAY]................................. - [OKAY][NO]fused_lamb [NO] - ....... ............. fused_lamb .......[OKAY][NO] -.............[OKAY]....... -[NO][OKAY]fused_lamb - .......fused_lamb............. [OKAY].............[NO] - [NO]....... .......[OKAY] -[OKAY] -sparse_attn ............ [NO] .......sparse_attn [OKAY]............ - [NO] .......transformersparse_attn sparse_attn ............[OKAY] - ........................[NO]transformer ............[NO].......[NO] [NO] [OKAY] ..................... - [OKAY][OKAY][OKAY] - - -stochastic_transformer stochastic_transformer.transformertransformer [NO]........................ . ....... [NO][NO][NO] [OKAY]..................... - [OKAY][OKAY] -[OKAY] - -stochastic_transformer stochastic_transformer . .[NO] [NO]....... .......[OKAY] -[OKAY] -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 ----------------------------------------------------------------------------------------------------- - -DeepSpeed C++/CUDA extension op report -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -----------------------------------------------------------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.-------------------------------------------------- - -DeepSpeed C++/CUDA extension op report -JIT compiled ops requires ninja-------------------------------------------------- -DeepSpeed C++/CUDA extension op report - - ---------------------------------------------------JIT compiled ops requires ninja --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. 
Op compatibility means that your system - meet the required dependencies to JIT install the op. - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.-------------------------------------------------- - ---------------------------------------------------JIT compiled ops requires ninja - -JIT compiled ops requires ninja -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -ninjaninjaninjaninja .................................... .................. .................. [OKAY][OKAY] [OKAY] -[OKAY] - --------------------------------------------------- --------------------------------------------------- --------------------------------------------------- --------------------------------------------------- -op name -op name op name op name................ ................................installed................ installedinstalled.. installed .... compatible .. - compatiblecompatible-------------------------------------------------- -compatible - - ----------------------------------------------------------------------------------------------------- --------------------------------------------------- - -cpu_adam ............... cpu_adamcpu_adam[YES]cpu_adam ............... .................................... [YES][OKAY][YES][YES] - ...... ...... ...... [OKAY][OKAY] - -[OKAY] -fused_adam ............. [NO] ....... [OKAY]fused_adam -fused_adamfused_adam fused_lamb....................................... .............[NO][NO] [NO] [NO]..................... [OKAY].......[OKAY] -[OKAY] - -[OKAY]fused_lamb -fused_lamb fused_lamb ............. ............. ............. [NO] [NO] [NO] ....... ....... ....... [OKAY]sparse_attn[OKAY] - -[OKAY]............ - [NO] ....... [OKAY] -transformer ............ [NO]sparse_attn sparse_attn sparse_attn....... ............ ............ ............[OKAY][NO][NO] - [NO].............. stochastic_transformer.......[OKAY][OKAY] - -[OKAY]transformer. -transformer ............[NO]............transformer [NO] .......[NO] ............ [OKAY].............. -[NO][OKAY][OKAY] - -....... [OKAY] -stochastic_transformerstochastic_transformer stochastic_transformer. . [NO][NO]. .............. [OKAY][NO] - [OKAY]....... - [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... 
[OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- ----------------------------------------------------------------------------------------------------- - -DeepSpeed C++/CUDA extension op reportDeepSpeed C++/CUDA extension op report - ------------------------------------------------------------------------------------------------------------------------------------------------------- - - -DeepSpeed C++/CUDA extension op reportNOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - - --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- - - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. -JIT compiled ops requires ninjaJIT compiled ops requires ninja -DeepSpeed C++/CUDA extension op report - - --------------------------------------------------- ---------------------------------------------------JIT compiled ops requires ninja - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninjaninjaninjaninja .................. .................................... .................. [OKAY][OKAY] [OKAY] - -[OKAY] --------------------------------------------------- --------------------------------------------------- --------------------------------------------------- -op name ---------------------------------------------------op name op name -................ op name ................................installed installed..................installed ..compatible ..compatibleinstalled - - compatible----------------------------------------------------------------------------------------------------.. - - -compatible-------------------------------------------------- - --------------------------------------------------- -cpu_adamcpu_adam .............................. cpu_adam [YES]cpu_adam[YES] ...... ............... ..................... [OKAY][OKAY][YES][YES] - - ............ [OKAY][OKAY] - -fused_adam fused_adam............. [NO]............. .......[NO] fused_adam [OKAY]fused_adam ....... -............. fused_lamb[OKAY]............. -[NO] .............[NO]....... fused_lamb .......[NO] [OKAY] .................... 
-[OKAY] -[OKAY][NO]fused_lamb - fused_lamb.................... [OKAY].............[NO] - [NO]....... sparse_attn.......[OKAY] - ............[OKAY] [NO] - sparse_attn....... ............[OKAY] -[NO] ....... [OKAY]transformer -sparse_attn ............ sparse_attn ............ transformer[NO] [NO]............ ............ ....... [NO] .......[OKAY] -[NO]....... [OKAY] ....... -stochastic_transformer [OKAY] [OKAY] -transformer -. ............[NO]transformer stochastic_transformer.......[NO]............ [OKAY] ........ -[NO] [NO][OKAY] -.............. [OKAY][OKAY] -stochastic_transformer - . [NO] stochastic_transformer....... [OKAY]. - [NO] ....... [OKAY] --------------------------------------------------- ---------------------------------------------------DeepSpeed C++/CUDA extension op report --------------------------------------------------- ----------------------------------------------------------------------------------------------------- - -DeepSpeed C++/CUDA extension op reportDeepSpeed C++/CUDA extension op report -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - -DeepSpeed C++/CUDA extension op report ----------------------------------------------------------------------------------------------------- --------------------------------------------------- - - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.--------------------------------------------------JIT compiled ops requires ninjaNOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - - - ---------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -JIT compiled ops requires ninja - -JIT compiled ops requires ninja ----------------------------------------------------------------------------------------------------- - -DeepSpeed C++/CUDA extension op reportDeepSpeed C++/CUDA extension op report - ----------------------------------------------------------------------------------------------------- - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.-------------------------------------------------- - - -----------------------------------------------------------------------------------------------------DeepSpeed C++/CUDA extension op report - - -JIT compiled ops requires ninjaJIT compiled ops requires ninja-------------------------------------------------- - - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. 
-[Start-up output, deduplicated: every rank prints the same blocks below; the interleaved per-rank copies are collapsed to one representative instance.]
--------------------------------------------------
-DeepSpeed C++/CUDA extension op report
--------------------------------------------------
-NOTE: Ops not installed will be just-in-time (JIT) compiled at
-      runtime if needed. Op compatibility means that your system
-      meet the required dependencies to JIT install the op.
--------------------------------------------------
-JIT compiled ops requires ninja
--------------------------------------------------
-ninja .................. [OKAY]
--------------------------------------------------
-op name ................ installed .. compatible
--------------------------------------------------
-cpu_adam ............... [YES] ...... [OKAY]
-fused_adam ............. [NO] ....... [OKAY]
-fused_lamb ............. [NO] ....... [OKAY]
-sparse_attn ............ [NO] ....... [OKAY]
-transformer ............ [NO] ....... [OKAY]
-stochastic_transformer . [NO] ....... [OKAY]
-/bin/sh: line 0: type: git: not found
-**** Git info for Megatron: git_hash=unknown git_branch=unknown ****
- [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`.
-async_io ............... [NO] ....... [NO]
-transformer_inference .. [NO] ....... [OKAY]
-utils .................. [YES] ...... [OKAY]
-quantizer .............. [NO] ....... [OKAY]
--------------------------------------------------
-DeepSpeed general environment info:
-torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']
-torch version .................... 1.8.1
-torch cuda version ............... 11.1
-nvcc version ..................... 11.2
-deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']
-deepspeed info ................... 0.4.2+bc17042, bc17042, big-science
-deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1
-[... the same op report, op tables, async_io warnings, Git info, and environment info blocks repeat, interleaved, for the remaining ranks ...]
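The op report above is DeepSpeed's standard `ds_report`-style output: ops marked `[NO]` under "installed" are not pre-built, and `[OKAY]` under "compatible" means they can be JIT-compiled at first use, which requires `ninja`. A minimal sketch, assuming a DeepSpeed 0.4.x install, of reproducing the same per-op check programmatically (the two builders shown are just examples):

    import shutil
    from deepspeed.ops.op_builder import CPUAdamBuilder, FusedAdamBuilder

    # JIT compilation of any extension op requires the ninja build tool on PATH.
    print("ninja", "[OKAY]" if shutil.which("ninja") else "[MISSING]")

    for builder in (CPUAdamBuilder(), FusedAdamBuilder()):
        # is_compatible() mirrors the "compatible" column of the report: it
        # checks whether this system can JIT-install the op when first used.
        print(builder.name, "[OKAY]" if builder.is_compatible() else "[NO]")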
[OKAY] [OKAY][OKAY] - -[OKAY] --------------------------------------------------- --------------------------------------------------- --------------------------------------------------- ---------------------------------------------------op name - - op nameop name................ op name................................ installedinstalled................installed .. installed....compatible -compatible..--------------------------------------------------compatible - - ---------------------------------------------------compatible-------------------------------------------------- - - --------------------------------------------------- -cpu_adam ............... [YES]cpu_adam cpu_adam......cpu_adam ...............[OKAY] ............... - [YES]...............[YES] ......[YES]...... [OKAY][OKAY]...... - -fused_adam [OKAY]............. - [NO] ....... [OKAY] -fused_adamfused_adamfused_lamb .............fused_adam............. ............. [NO][NO] ............. [NO].............. [OKAY][NO].......[OKAY] - -.......[OKAY] -[OKAY]fused_lamb - .............fused_lamb fused_lamb[NO]............. sparse_attn[NO].................... [OKAY] ................... - [NO] [NO].......[OKAY] -.......[OKAY] -[OKAY] -transformersparse_attn ........................ [NO][NO] sparse_attn ....... ....... ............ [OKAY]sparse_attn -[OKAY] [NO] -............ stochastic_transformer .......[NO]transformer. ...................[OKAY][NO] [OKAY] -[NO] - .............. transformer[OKAY] [OKAY]transformer -............ - ............[NO] [NO]....... stochastic_transformer.......[OKAY] -[OKAY]. - [NO]stochastic_transformer .......stochastic_transformer .[OKAY] -.[NO] [NO]....... .......[OKAY] -[OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_io ...............async_io [NO]............... .......[NO] [NO]....... - [NO] -transformer_inferencetransformer_inference .... [NO][NO] .............. [OKAY][OKAY] - -utilsutils .................................... [YES][YES] ............ [OKAY][OKAY] - -quantizerquantizer ............................ [NO][NO] .............. [OKAY][OKAY] - ----------------------------------------------------------------------------------------------------- - -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** ----------------------------------------------------------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -DeepSpeed C++/CUDA extension op report---------------------------------------------------------------------------------------------------- - - - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. 
Op compatibility means that your system - meet the required dependencies to JIT install the op.--------------------------------------------------JIT compiled ops requires ninjaDeepSpeed C++/CUDA extension op report - - - ---------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - -JIT compiled ops requires ninja --------------------------------------------------- -JIT compiled ops requires ninja -ninjaninjaninjaninja ........................................................................ [OKAY] [OKAY] -[OKAY][OKAY] - - ----------------------------------------------------------------------------------------------------- - ----------------------------------------------------------------------------------------------------- - -op nameop nameop nameop name ................................................................ installedinstalledinstalledinstalled ........ compatible compatiblecompatible -compatible - - ----------------------------------------------------------------------------------------------------- ----------------------------------------------------------------------------------------------------- - - -cpu_adam cpu_adam...............cpu_adamcpu_adam ..............................[YES]............... [YES]......[YES][YES] ......[OKAY]............ - [OKAY][OKAY] -[OKAY] - -fused_adam .............fused_adamfused_adam [NO].............fused_adam............. ....... [NO]............. [NO] [OKAY] .......[NO] -....... [OKAY].......[OKAY] - -fused_lamb[OKAY] -.............fused_lamb fused_lamb [NO] .......................... fused_lamb....... [NO] [NO] .............[OKAY]....... - [NO].......[OKAY] -.......[OKAY] -[OKAY] -sparse_attn ............ [NO]sparse_attn sparse_attn ....... ............ ............sparse_attn[OKAY] -[NO][NO]............ ..............transformer[NO] [OKAY] [OKAY]............ -....... - [NO][OKAY] transformer -....... transformer ............ [OKAY]transformer ............ - [NO]............ .......[NO][NO] [OKAY]..............stochastic_transformer - [OKAY][OKAY] - -.stochastic_transformer [NO] stochastic_transformer....... . stochastic_transformer[OKAY].[NO] - [NO]........ .......[OKAY][NO] - [OKAY]....... - [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... 
[NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report-------------------------------------------------- --------------------------------------------------- - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.DeepSpeed C++/CUDA extension op report - --------------------------------------------------- ---------------------------------------------------JIT compiled ops requires ninja - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_ioasync_io .............................. [NO][NO] .............. [NO][NO] - -transformer_inferencetransformer_inference .... [NO][NO] .............. [OKAY][OKAY] - -utils ..................utils [YES].................. ......[YES] [OKAY]...... - [OKAY] -quantizer .............. quantizer[NO] ..................... [NO][OKAY] -....... [OKAY] --------------------------------------------------- --------------------------------------------------- -ninjaninjaninjaninja .................. .................................... 
..................[OKAY] - [OKAY][OKAY][OKAY] --------------------------------------------------- - - -----------------------------------------------------------------------------------------------------op name --------------------------------------------------- -................ -op nameop name op name ................installed ................ ..................installedinstalled compatible..installed.. - --------------------------------------------------compatible.. - -compatible -------------------------------------------------- -compatible - --------------------------------------------------- --------------------------------------------------- -cpu_adam ...............cpu_adam [YES]...............cpu_adam cpu_adam ......[YES] ............... ...............[OKAY] ...... - [YES] [YES][OKAY] - ............ [OKAY][OKAY] - -fused_adam ............. [NO] fused_adam....... .............[OKAY] -fused_adam[NO]fused_adam fused_lamb................................. [OKAY] -.............[NO][NO] [NO] fused_lamb....... ....... ....................[OKAY] -[NO][OKAY][OKAY] - -fused_lamb....... [OKAY]fused_lamb............. - .............[NO] [NO]....... .......[OKAY] -[OKAY] -sparse_attn ............ [NO] sparse_attn....... ............[OKAY] -[NO] .......transformer sparse_attn[OKAY] sparse_attn ............ - ............ transformer............ [NO] [NO] ............[NO] ....... .......[NO][OKAY]....... - .......[OKAY][OKAY] - -stochastic_transformer[OKAY] -transformer. transformerstochastic_transformer............ [NO] ............ [NO]........ [NO] [NO]....... [OKAY] -..............[OKAY] -[OKAY][OKAY] - -stochastic_transformer stochastic_transformer. [NO] ........ [NO][OKAY] -....... [OKAY] -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -transformer_inference .. [NO] utils....... [OKAY] -.................. [YES] ......utils [OKAY] -.................. [YES] ...... [OKAY] -quantizer ..............quantizer [NO].............. [NO] ....... .......[OKAY] -[OKAY] --------------------------------------------------- --------------------------------------------------- -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... 
[OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** ------------------------------------------------------------------------------------------------------------------------------------------------------- - - ---------------------------------------------------DeepSpeed C++/CUDA extension op reportDeepSpeed C++/CUDA extension op reportDeepSpeed C++/CUDA extension op report - - - -----------------------------------------------------------------------------------------------------DeepSpeed C++/CUDA extension op report - --------------------------------------------------- ---------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - - - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.--------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.-------------------------------------------------- - - - ---------------------------------------------------JIT compiled ops requires ninjaJIT compiled ops requires ninja-------------------------------------------------- - - -JIT compiled ops requires ninja - -JIT compiled ops requires ninja - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_ioasync_io .............................. [NO][NO] .............. [NO][NO] - -transformer_inferencetransformer_inference .... [NO][NO] .............. [OKAY][OKAY] - -utilsutils .................................... [YES][YES] ............ [OKAY][OKAY] - -quantizerquantizer ............................ [NO][NO] .............. [OKAY][OKAY] - ----------------------------------------------------------------------------------------------------- - -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install pathtorch install path .............................. ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch versiontorch version ........................................ 1.8.11.8.1 - -torch cuda versiontorch cuda version .............................. 11.111.1 - -nvcc versionnvcc version .......................................... 11.211.2 - -deepspeed install pathdeepspeed install path ...................... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed infodeepspeed info ...................................... 0.4.2+bc17042, bc17042, big-science -0.4.2+bc17042, bc17042, big-sciencedeepspeed wheel compiled w. - deepspeed wheel compiled w....... 
......torch 1.8, cuda 11.1 -torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -ninjaninjaninjaninja .................. .................. ..................[OKAY].................. [OKAY] -quantizer .............. [NO] ....... [OKAY] -[OKAY][OKAY] --------------------------------------------------- - - -----------------------------------------------------------------------------------------------------op name - --------------------------------------------------- --------------------------------------------------- op name -................op name op name installed................ ................installed ................ installed.... installed ..compatiblecompatible.. - - compatible--------------------------------------------------compatible-------------------------------------------------- - - - ----------------------------------------------------------------------------------------------------- - -cpu_adamcpu_adam ...............cpu_adamcpu_adam............... [YES]..............................[YES] [YES]...... ......[YES] ...... [OKAY] ......[OKAY] -[OKAY] - -[OKAY] -fused_adam .............fused_adam fused_adamfused_adam [NO] ....................................... ....... [NO][NO] [NO][OKAY] -.............. .......[OKAY] fused_lamb -[OKAY][OKAY]............. - - [NO]fused_lamb .......fused_lambfused_lamb............. [NO][OKAY] ............. -.................... [NO][NO][OKAY] -.............. [OKAY][OKAY] -sparse_attn - ............ [NO] ....... [OKAY] -sparse_attntransformer ........................ sparse_attn [NO]sparse_attn [NO] ....... ............................... [OKAY][NO][OKAY][NO] - - ..............transformer stochastic_transformer ............[OKAY] [OKAY] -. -[NO] [NO]transformer.......transformer ...............................[OKAY] - [NO][OKAY][NO] -stochastic_transformer ....... ....... [OKAY]. -[OKAY] -[NO] stochastic_transformer....... stochastic_transformer[OKAY]. -.[NO] [NO]....... .......[OKAY] -[OKAY] -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 
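The repeated `[WARNING] async_io requires the libraries: ['libaio-dev'] but are missing` lines explain the `async_io ... [NO] ....... [NO]` row: the op is neither installed nor buildable until the libaio development headers are present (`apt install libaio-dev`, as the warning says). A hedged sketch of the same check, assuming `AsyncIOBuilder` is exposed by this DeepSpeed tree:

    from deepspeed.ops.op_builder import AsyncIOBuilder

    # False here corresponds to the "[NO] ....... [NO]" line in the log:
    # without the libaio headers the op can be neither loaded nor JIT-built.
    if not AsyncIOBuilder().is_compatible():
        print("async_io unavailable; fix with: apt install libaio-dev")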
11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- ----------------------------------------------------------------------------------------------------- -DeepSpeed C++/CUDA extension op report - -DeepSpeed C++/CUDA extension op report-------------------------------------------------- - ---------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. ------------------------------------------------------------------------------------------------------------------------------------------------------- - -JIT compiled ops requires ninja -JIT compiled ops requires ninja - -DeepSpeed C++/CUDA extension op report --------------------------------------------------- ---------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. 
- -DeepSpeed C++/CUDA extension op report-------------------------------------------------- - -JIT compiled ops requires ninja-------------------------------------------------- - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_io ............... [NO] ....... [NO]async_io - ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... transformer_inference[OKAY] -.. [NO] ....... utils[OKAY] -.................. [YES] ...... utils[OKAY] -.................. [YES] quantizer...... ..............[OKAY] -[NO] ....... quantizer[OKAY] -.............. [NO] .......-------------------------------------------------- -[OKAY] --------------------------------------------------- ------------------------------------------------------------------------------------------------------------------------------------------------------- - - -DeepSpeed C++/CUDA extension op reportDeepSpeed C++/CUDA extension op report -DeepSpeed C++/CUDA extension op report --------------------------------------------------- - -----------------------------------------------------------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- - - ---------------------------------------------------DeepSpeed C++/CUDA extension op report --------------------------------------------------- -JIT compiled ops requires ninjaJIT compiled ops requires ninja - ---------------------------------------------------JIT compiled ops requires ninja - - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninjaninjaninjaninja ...................................................... .................. [OKAY][OKAY][OKAY][OKAY] - - - ----------------------------------------------------------------------------------------------------- - ----------------------------------------------------------------------------------------------------- -op nameop name - op name ................ 
................op name................ installedinstalled................installed ....installed.. compatible compatible.. - -compatible ---------------------------------------------------------------------------------------------------- -compatible - - --------------------------------------------------- --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -cpu_adamcpu_adam ..............................cpu_adam cpu_adam[YES][YES] .................................... ...... [YES][YES][OKAY] -......[OKAY]...... - [OKAY][OKAY] - -async_io ............... [NO] ....... [NO] -fused_adam fused_adam............. fused_adam.............[NO] .............fused_adam[NO]....... [NO] .................... [OKAY]....... - [OKAY][NO][OKAY] - -transformer_inference .. [NO] ....... [OKAY] -.......fused_lamb [OKAY]fused_lambfused_lamb............. -utils .................. [YES] ...... [OKAY] - ..........................[NO] fused_lamb[NO] [NO] .................... ....... .......[NO][OKAY] [OKAY] -....... -quantizer .............. [NO] ....... [OKAY] -[OKAY] -[OKAY] --------------------------------------------------- -sparse_attnsparse_attnsparse_attn sparse_attn............ ........................ [NO] ............ [NO] .......[NO] [NO] .......[OKAY] ....... -....... [OKAY] [OKAY]transformer[OKAY] - - -............transformer transformertransformer [NO] ........................................... [NO] [NO][NO][OKAY] -..................... [OKAY][OKAY][OKAY] - - -stochastic_transformer .stochastic_transformer stochastic_transformerstochastic_transformer [NO] .......... [OKAY][NO][NO][NO] - ....... ....... ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -[OKAY][OKAY] - -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -ninjaninjaninjaninja ...................................................... .................. [OKAY][OKAY] [OKAY] - -[OKAY] ----------------------------------------------------------------------------------------------------- - - ----------------------------------------------------------------------------------------------------- -op nameop name - op name................................op name ................installedinstalled ................installed .... installedcompatiblecompatible.. - - -------------------------------------------------- ..--------------------------------------------------compatible - - -compatible ----------------------------------------------------------------------------------------------------- - -cpu_adam ...............cpu_adam [YES]...............cpu_adam cpu_adam ......[YES] ............... [OKAY] .....................[YES] - [YES][OKAY]...... - ......[OKAY] -[OKAY] -fused_adam ............. [NO] ....... fused_adam[OKAY] fused_adam -............. fused_adam.............fused_lamb[NO] [NO]................................. .......[NO][NO][OKAY] -[OKAY].............. - [OKAY]fused_lamb[OKAY] - fused_lamb -............. fused_lamb[NO]............. ....................[NO] [NO][OKAY]....... - [OKAY].......sparse_attn - ............[OKAY] -[NO] ....... [OKAY] -sparse_attn transformer............ ............sparse_attn[NO] ............[NO]....... 
sparse_attn .......[NO] [OKAY] ............ -[OKAY]....... - transformer[NO][OKAY] -stochastic_transformer................... transformer [NO].[OKAY] ....... -[NO]............[OKAY] -.......transformer[NO] [OKAY]............ -.......stochastic_transformer [NO][OKAY] -........ [NO][OKAY]stochastic_transformer -....... [OKAY] -.stochastic_transformer [NO] ........ [OKAY][NO] - ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- --------------------------------------------------- ---------------------------------------------------DeepSpeed C++/CUDA extension op report - -----------------------------------------------------------------------------------------------------DeepSpeed C++/CUDA extension op report-------------------------------------------------- - - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.DeepSpeed C++/CUDA extension op report --------------------------------------------------- - -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.-------------------------------------------------- --------------------------------------------------- - -JIT compiled ops requires ninja ---------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - - -JIT compiled ops requires ninja---------------------------------------------------------------------------------------------------- - - -JIT compiled ops requires ninjaJIT compiled ops requires ninja - -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install pathtorch install path .............................. ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch versiontorch version ........................................ 1.8.11.8.1 - -torch cuda versiontorch cuda version .............................. 11.111.1 - -nvcc versionnvcc version .......................................... 11.211.2 - -deepspeed install pathdeepspeed install path ...................... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed infodeepspeed info ...................................... 0.4.2+bc17042, bc17042, big-science0.4.2+bc17042, bc17042, big-science - -deepspeed wheel compiled w. deepspeed wheel compiled w....... ......torch 1.8, cuda 11.1 -torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. 
-async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -ninjaninjaninjaninja ........................................................................ [OKAY][OKAY][OKAY][OKAY] - - - ------------------------------------------------------------------------------------------------------------------------------------------------------- --------------------------------------------------- - -op name - op nameop nameop name ................ ................ ................................ installedinstalled installed installed...... compatible..compatiblecompatible - - -compatible------------------------------------------------------------------------------------------------------------------------------------------------------ - - - --------------------------------------------------- -cpu_adamcpu_adam cpu_adamcpu_adam ............... .............................. ............... [YES][YES][YES][YES] ........................ [OKAY][OKAY] [OKAY] - -[OKAY] - -fused_adamfused_adam fused_adamfused_adam ............. .......................... ............. [NO] [NO] [NO][NO] ....... ....... ....... [OKAY] [OKAY].......[OKAY] - - -[OKAY] -fused_lambfused_lambfused_lamb fused_lamb .......................... ............. ............. [NO][NO] [NO] [NO] ....... .............. ....... [OKAY] [OKAY][OKAY][OKAY] - - - -sparse_attnsparse_attnsparse_attnsparse_attn .................................... ............ [NO][NO][NO][NO] ....... ..................... [OKAY] [OKAY][OKAY] -[OKAY] - - -transformer transformertransformertransformer............ ....................................[NO] [NO][NO][NO]....... ..............[OKAY] -.......[OKAY][OKAY] - -[OKAY] -stochastic_transformerstochastic_transformer stochastic_transformerstochastic_transformer . .[NO] ..[NO]....... [NO].......[NO] [OKAY] [OKAY] -.............. - [OKAY][OKAY] - ----------------------------------------------------------------------------------------------------- - -DeepSpeed C++/CUDA extension op reportDeepSpeed C++/CUDA extension op report - ----------------------------------------------------------------------------------------------------- --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - --------------------------------------------------- -DeepSpeed C++/CUDA extension op report-------------------------------------------------- - --------------------------------------------------- ---------------------------------------------------DeepSpeed C++/CUDA extension op reportJIT compiled ops requires ninja - - -JIT compiled ops requires ninja --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. 
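The `DeepSpeed general environment info:` block pins down the software stack of the run: torch 1.8.1 built against CUDA 11.1, nvcc 11.2 on the nodes, and DeepSpeed 0.4.2 from the `big-science` branch at commit `bc17042`. A minimal sketch that regenerates the same fields; the `__git_hash__`/`__git_branch__` attributes are assumed to exist as in DeepSpeed's own environment report:

    import torch
    import deepspeed

    print("torch install path ...............", torch.__path__)
    print("torch version ....................", torch.__version__)
    print("torch cuda version ...............", torch.version.cuda)
    print("deepspeed info ...................", deepspeed.__version__,
          deepspeed.__git_hash__, deepspeed.__git_branch__)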
----------------------------------------------------------------------------------------------------- - -JIT compiled ops requires ninjaJIT compiled ops requires ninja - ----------------------------------------------------------------------------------------------------- - -DeepSpeed C++/CUDA extension op reportDeepSpeed C++/CUDA extension op report - ------------------------------------------------------------------------------------------------------------------------------------------------------- - - -DeepSpeed C++/CUDA extension op report--------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - - - -----------------------------------------------------------------------------------------------------DeepSpeed C++/CUDA extension op report-------------------------------------------------- - - - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.--------------------------------------------------JIT compiled ops requires ninjaJIT compiled ops requires ninja - - - --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.JIT compiled ops requires ninja - --------------------------------------------------- -JIT compiled ops requires ninja -ninjaninjaninjaninja .................. .................. .................. [OKAY]..................[OKAY] - -[OKAY][OKAY]-------------------------------------------------- --------------------------------------------------- - - -----------------------------------------------------------------------------------------------------op nameop name - - op name................op name................ ................................ installedinstalledinstalled installed .. .... .. compatiblecompatible compatible - -compatible ------------------------------------------------------------------------------------------------------------------------------------------------------- - - - --------------------------------------------------- -cpu_adamcpu_adam ...............cpu_adam...............cpu_adam ...............[YES]...............[YES] ......[YES] [YES]...... [OKAY] ...... -...... [OKAY][OKAY] - -[OKAY] -fused_adam ............. [NO]fused_adam fused_adam....... ............. ............. fused_adam[OKAY][NO] -.......[NO]............. [OKAY]fused_lamb.......[NO] - ............. [OKAY]fused_lamb ....... -[NO] .............[OKAY]....... - [NO]fused_lamb fused_lamb[OKAY] ....... -.......................... [NO][OKAY][NO] - .............. [OKAY][OKAY] - -sparse_attn ............ [NO] ....... [OKAY] -sparse_attn ............ transformer[NO] ................... [NO]sparse_attnsparse_attn [OKAY] -............................... transformer[OKAY][NO] - [NO]................... .......stochastic_transformer[OKAY][NO] - [OKAY]....... -. transformer[OKAY][NO] -transformer ............ .......stochastic_transformer[NO]............ [NO][OKAY]........ - .......[OKAY][NO] - [OKAY]....... 
- [OKAY]stochastic_transformer - stochastic_transformer . .[NO] [NO]....... .......[OKAY] -[OKAY] -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install path torch install path............... ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch version torch version.................... ....................1.8.1 -1.8.1 -torch cuda version torch cuda version............... ...............11.1 -11.1nvcc version - nvcc version..................... .....................11.2 -11.2deepspeed install path - deepspeed install path........... ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -0.4.2+bc17042, bc17042, big-sciencedeepspeed info - ...................deepspeed wheel compiled w. 0.4.2+bc17042, bc17042, big-science...... - deepspeed wheel compiled w.torch 1.8, cuda 11.1 -DeepSpeed general environment info: -...... torch 1.8, cuda 11.1 -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. ----------------------------------------------------------------------------------------------------- - -JIT compiled ops requires ninja-------------------------------------------------- -DeepSpeed C++/CUDA extension op report - ---------------------------------------------------DeepSpeed C++/CUDA extension op report - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.-------------------------------------------------- - ---------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - -JIT compiled ops requires ninja-------------------------------------------------- - -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninjaninjaninjaninja .................................... .................. [OKAY] .................. 
---------------------------------------------------
-DeepSpeed C++/CUDA extension op report
---------------------------------------------------
-NOTE: Ops not installed will be just-in-time (JIT) compiled at
- runtime if needed. Op compatibility means that your system
- meet the required dependencies to JIT install the op.
---------------------------------------------------
-JIT compiled ops requires ninja
-ninja .................. [OKAY]
---------------------------------------------------
-op name ................ installed .. compatible
---------------------------------------------------
-cpu_adam ............... [YES] ...... [OKAY]
-fused_adam ............. [NO] ....... [OKAY]
-fused_lamb ............. [NO] ....... [OKAY]
-sparse_attn ............ [NO] ....... [OKAY]
-transformer ............ [NO] ....... [OKAY]
-stochastic_transformer . [NO] ....... [OKAY]
---------------------------------------------------
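The [NO]/[OKAY] table above is DeepSpeed's op-compatibility report: [NO] under "installed" only means the extension was not prebuilt and will be JIT-compiled (via ninja) on first use. A minimal sketch of how one might regenerate the same report on a node, assuming the `ds_report` console script that DeepSpeed installs and the ninja helper in `torch.utils.cpp_extension`:

    import subprocess
    from torch.utils.cpp_extension import is_ninja_available

    # JIT compilation of the [NO]-marked ops needs ninja on PATH;
    # the report above confirms this with "ninja ... [OKAY]".
    print("ninja available:", is_ninja_available())

    # `ds_report` prints the same op table and environment summary
    # seen in this log.
    subprocess.run(["ds_report"], check=True)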
-DeepSpeed general environment info:
-torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']
-torch version .................... 1.8.1
-torch cuda version ............... 11.1
-nvcc version ..................... 11.2
-deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']
-deepspeed info ................... 0.4.2+bc17042, bc17042, big-science
-deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1
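The same fields can be gathered directly in Python; a sketch using only public torch/deepspeed attributes (nvcc is a separate binary, so it is queried via subprocess):

    import subprocess
    import torch
    import deepspeed

    # Mirrors the "DeepSpeed general environment info" block above.
    print("torch install path ...", torch.__path__)
    print("torch version ........", torch.__version__)
    print("torch cuda version ...", torch.version.cuda)
    print("deepspeed install path", deepspeed.__path__)
    print("deepspeed info .......", deepspeed.__version__)
    print(subprocess.run(["nvcc", "--version"],
                         capture_output=True, text=True).stdout)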
- [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`.
-async_io ............... [NO] ....... [NO]
-transformer_inference .. [NO] ....... [OKAY]
-utils .................. [YES] ...... [OKAY]
-quantizer .............. [NO] ....... [OKAY]
---------------------------------------------------
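The async_io warning only matters if the op is actually needed (it backs NVMe offload); otherwise it is never JIT-built, and the warning can be silenced by installing the libaio headers (`apt install libaio-dev`, as the message says) before building DeepSpeed. A hedged sketch for probing the op on a node, assuming the `AsyncIOBuilder` op builder exposed by DeepSpeed 0.4.x:

    # Hedged sketch, assuming DeepSpeed 0.4.x exposes AsyncIOBuilder;
    # is_compatible() returns False here because libaio-dev is absent.
    from deepspeed.ops.op_builder import AsyncIOBuilder

    print("async_io compatible:", AsyncIOBuilder().is_compatible())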
-/bin/sh: line 0: type: git: not found
-**** Git info for Megatron: git_hash=unknown git_branch=unknown ****
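`git_hash=unknown` is not an error in Megatron itself: the compute nodes have no `git` binary on PATH (hence `/bin/sh: line 0: type: git: not found`), so the version probe falls back to "unknown". A hedged sketch (not Megatron's actual code) of how such a banner is typically produced:

    import subprocess

    def git_info(default="unknown"):
        # Falls back to "unknown" when git is absent from PATH,
        # which is exactly what the log above shows.
        try:
            h = subprocess.check_output(
                ["git", "rev-parse", "--short", "HEAD"], text=True).strip()
            b = subprocess.check_output(
                ["git", "rev-parse", "--abbrev-ref", "HEAD"], text=True).strip()
            return h, b
        except (OSError, subprocess.CalledProcessError):
            return default, default

    git_hash, git_branch = git_info()
    print(f"**** Git info for Megatron: git_hash={git_hash} git_branch={git_branch} ****")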
['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -transformer_inference .. [NO] ....... [OKAY] -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -ninjaninjaninjaninja ........................................................................ [OKAY][OKAY][OKAY][OKAY] - - - ----------------------------------------------------------------------------------------------------- --------------------------------------------------- --------------------------------------------------- -op name -op name op name................ op name................................installed ................installed..installed installed..compatible.. - ..compatiblecompatible-------------------------------------------------- - - -compatible---------------------------------------------------------------------------------------------------- - - --------------------------------------------------- -cpu_adam ............... cpu_adamcpu_adamcpu_adam[YES] ................................................... [YES][YES][YES][OKAY] -.................. [OKAY][OKAY][OKAY] - - -fused_adam ............. [NO]fused_adam fused_adamfused_adam ....... .......................... ............. [OKAY][NO][NO] -[NO] ..............fused_lamb....... [OKAY][OKAY].............[OKAY] - - -[NO] ....... fused_lambfused_lambfused_lamb [OKAY] ............. -.......................... [NO][NO][NO] ..................... [OKAY][OKAY][OKAY] - - -sparse_attn ............ [NO] .......sparse_attn [OKAY]sparse_attnsparse_attn............ - ........................[NO] [NO]transformer [NO] ....... ................... ....... [OKAY] [OKAY] -[OKAY][NO] - -transformer.......transformer transformer[OKAY]........................ - ............ [NO] [NO] stochastic_transformer[NO] ....... ....... ....... [OKAY] .[OKAY] -[OKAY] - -[NO] ....... stochastic_transformerstochastic_transformer[OKAY] stochastic_transformer - .. . [NO] [NO] [NO] ....... .............. [OKAY][OKAY][OKAY] - - - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -transformer_inference .. [NO] ....... utils[OKAY] -.................. [YES] ......utils [OKAY].................. - [YES] ...... quantizer[OKAY] -.............. [NO] quantizer....... ..............[OKAY] -[NO] ....... --------------------------------------------------[OKAY] - --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_io ............... [NO] async_io....... [NO]............... - [NO] .......async_io [NO] -............... [NO] ....... [NO]transformer_inference - .. [NO] ....... [OKAY] -transformer_inference .. transformer_inferenceutils[NO] ........................... 
[NO][YES] [OKAY]............. - [OKAY][OKAY] - -utils ..................quantizer [YES]utils.............. ........................[NO] [OKAY][YES]....... - ......[OKAY] -[OKAY]quantizer - .............. [NO]-------------------------------------------------- -.......quantizer [OKAY].............. - [NO] .......-------------------------------------------------- -[OKAY] --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system -/bin/sh: line 0: type: git: not found - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.-------------------------------------------------- ----------------------------------------------------------------------------------------------------- - - -DeepSpeed C++/CUDA extension op reportDeepSpeed C++/CUDA extension op reportJIT compiled ops requires ninja - - ----------------------------------------------------------------------------------------------------- - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. 
Op compatibility means that your system
- meet the required dependencies to JIT install the op.
---------------------------------------------------
-JIT compiled ops requires ninja
-ninja .................. [OKAY]
---------------------------------------------------
-op name ................ installed .. compatible
---------------------------------------------------
-cpu_adam ............... [YES] ...... [OKAY]
-fused_adam ............. [NO] ....... [OKAY]
-fused_lamb ............. [NO] ....... [OKAY]
-sparse_attn ............ [NO] ....... [OKAY]
-transformer ............ [NO] ....... [OKAY]
-stochastic_transformer . [NO] ....... [OKAY]
---------------------------------------------------
- [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`.
-async_io ............... [NO] ....... [NO]
-transformer_inference .. [NO] ....... [OKAY]
-utils .................. [YES] ...... [OKAY]
-quantizer .............. [NO] ....... [OKAY]
---------------------------------------------------
-DeepSpeed general environment info:
-torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']
-torch version .................... 1.8.1
-torch cuda version ............... 11.1
-nvcc version ..................... 11.2
-deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']
-deepspeed info ................... 0.4.2+bc17042, bc17042, big-science
-deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1
-/bin/sh: line 0: type: git: not found
-**** Git info for Megatron: git_hash=unknown git_branch=unknown ****
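The block above is DeepSpeed's standard environment diagnostic. A minimal sketch of regenerating it by hand, assuming the same tr1-13B conda env is active: `ds_report` is the console script DeepSpeed installs for exactly this output, and the apt line is the fix the warning itself names.

    # re-print the op-compatibility table and environment info for one process
    ds_report
    # supply the system library the async_io warning flags as missing
    apt install libaio-dev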
torch 1.8, cuda 11.1torch 1.8, cuda 11.1 - -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -DeepSpeed general environment info: -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_io async_io............... [NO]............... .......[NO] [NO]....... - [NO] -transformer_inference .. [NO]transformer_inference ......... [OKAY][NO] - ....... [OKAY] -utils .................. [YES]utils ........................ [OKAY][YES] - ...... [OKAY] -quantizer ..............quantizer [NO].............. .......[NO] [OKAY]....... - [OKAY] --------------------------------------------------- --------------------------------------------------- -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -ninjaninjaninjaninja ........................................................................ [OKAY][OKAY][OKAY] -[OKAY] - --------------------------------------------------- -deepspeed info ................... 
0.4.2+bc17042, bc17042, big-science --------------------------------------------------- ----------------------------------------------------------------------------------------------------- -op name - -op name op name................op name................ installed................installed ................ ..installed.. compatiblecompatibleinstalled.. - - --------------------------------------------------..--------------------------------------------------compatible -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - - -compatible-------------------------------------------------- - --------------------------------------------------- -DeepSpeed general environment info: -cpu_adam cpu_adam............... ...............[YES] cpu_adam cpu_adam[YES]...... .............................. ...... [OKAY][YES][OKAY][YES] - - ............ [OKAY][OKAY] - -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -fused_adamfused_adam .......................... fused_adam[NO]fused_adam [NO]....... [OKAY] ............. -torch version .................... 1.8.1 -.................... fused_lamb[OKAY] [NO][NO] -torch cuda version ............... 11.1 - ............. ....... .......fused_lamb [NO] [OKAY][OKAY].................... - - [OKAY][NO]fused_lamb -nvcc version ..................... 11.2 - fused_lamb.................... .............[NO][OKAY] -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -[NO]....... .......[OKAY] -[OKAY]sparse_attn - ............ [NO] ....... [OKAY] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -sparse_attn transformer............ sparse_attn............[NO]sparse_attn ............[NO]....... ............[NO] [OKAY]....... [NO] -....... [OKAY] -transformer[OKAY]....... -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - ............[OKAY]stochastic_transformertransformer - [NO]. ...................transformer ............[NO] [NO][OKAY] [NO] -....... ..............[OKAY] - stochastic_transformer[OKAY][OKAY] - -. [NO]stochastic_transformer stochastic_transformer....... .[OKAY]. - [NO][NO] .............. [OKAY][OKAY] - -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 
11.1 -nvcc version ..................... 11.2 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -async_io ............... [NO] ....... [NO] -async_io ............... [NO] ....... [NO] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.-------------------------------------------------- --------------------------------------------------- - -transformer_inference .. [NO] ....... [OKAY] -transformer_inference utils.. ..................[NO] [YES]....... ......[OKAY] -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -JIT compiled ops requires ninjaDeepSpeed C++/CUDA extension op report-------------------------------------------------- --------------------------------------------------- - - ---------------------------------------------------DeepSpeed C++/CUDA extension op reportDeepSpeed C++/CUDA extension op report - - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.-------------------------------------------------- --------------------------------------------------- - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.--------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - -[OKAY] -quantizer .............. utils[NO] ......................... [YES][OKAY] ---------------------------------------------------JIT compiled ops requires ninja - - ---------------------------------------------------JIT compiled ops requires ninja - -...... [OKAY] --------------------------------------------------- -JIT compiled ops requires ninja -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -ninjaninjaninjaninja .................................... .................. ..................[OKAY][OKAY] - -[OKAY][OKAY]-------------------------------------------------- --------------------------------------------------- - - -----------------------------------------------------------------------------------------------------op name - - op name................op nameop name ................installed................................ 
installed ..installed installed ..compatible.... - compatible--------------------------------------------------compatiblecompatible - - - ------------------------------------------------------------------------------------------------------------------------------------------------------- - - -cpu_adam ............... [YES]cpu_adam cpu_adamcpu_adam...... ...............[OKAY]............... ............... - [YES] [YES] [YES] ...... ...... ...... [OKAY] [OKAY] -[OKAY] - -fused_adam ............. [NO] ....... fused_adam[OKAY]fused_adam - fused_adam .......................... [NO]fused_lamb.............[NO] ....... .............[NO].......[OKAY] -[NO][OKAY] -.......fused_lamb....... [OKAY].............[OKAY]fused_lamb - - [NO]............. [NO]fused_lamb....... ....................[OKAY] -[OKAY][NO] - ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -sparse_attnsparse_attn transformer........................ ............ sparse_attn[NO] [NO] [NO] ................... ....... ....... [NO][OKAY][OKAY] - -.......[OKAY] -transformertransformer[OKAY] -........................stochastic_transformer transformer [NO][NO]............. ..............[NO][NO] [OKAY].......[OKAY]....... - - [OKAY][OKAY] -stochastic_transformer -stochastic_transformer .. stochastic_transformer [NO] [NO] .............. . [OKAY] [OKAY] -[NO] - ....... [OKAY] -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 ----------------------------------------------------------------------------------------------------- - -DeepSpeed C++/CUDA extension op reportDeepSpeed C++/CUDA extension op report --------------------------------------------------- --------------------------------------------------- - ---------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.DeepSpeed C++/CUDA extension op report --------------------------------------------------- - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. ----------------------------------------------------------------------------------------------------- -DeepSpeed C++/CUDA extension op report - - ---------------------------------------------------JIT compiled ops requires ninjaNOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.-------------------------------------------------- - - - -JIT compiled ops requires ninja--------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. 
- - -JIT compiled ops requires ninja-------------------------------------------------- - -JIT compiled ops requires ninja -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -DeepSpeed general environment info: -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -async_io ............... [NO] ....... [NO] -torch version .................... 1.8.1 -transformer_inference .. [NO] ....... [OKAY] -torch cuda version ............... 11.1 -utils .................. [YES] ...... [OKAY] -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -quantizer .............. [NO] ....... [OKAY] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- ----------------------------------------------------------------------------------------------------- - -DeepSpeed C++/CUDA extension op reportDeepSpeed C++/CUDA extension op report - ------------------------------------------------------------------------------------------------------------------------------------------------------- - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. -DeepSpeed C++/CUDA extension op report - ---------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.-------------------------------------------------- - - -JIT compiled ops requires ninja----------------------------------------------------------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - - - ---------------------------------------------------DeepSpeed C++/CUDA extension op reportJIT compiled ops requires ninja - - -JIT compiled ops requires ninja --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. 
--------------------------------------------------- -JIT compiled ops requires ninja - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -async_io ............... transformer_inference[NO] ......... [NO][NO] -....... [OKAY] -utils .................. [YES]transformer_inference ........ [OKAY][NO] - ....... [OKAY]quantizer - .............. [NO] ....... [OKAY]utils - .................. [YES]-------------------------------------------------- -...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -ninjaninjaninjaninja ...................................................... [OKAY][OKAY][OKAY] - -.................. --------------------------------------------------- ---------------------------------------------------------------------------------------------------- - -[OKAY]op name -op name - op name ................ ................--------------------------------------------------................ installed installedinstalled - .. .. ..op name compatiblecompatible - compatible -----------------------------------------------------------------------------------------------------................ - - - --------------------------------------------------installed - .. compatible -cpu_adam-------------------------------------------------- cpu_adam -...............cpu_adam ...............[YES]............... [YES]......[YES] ......[OKAY]...... - [OKAY]cpu_adam[OKAY] - -............... [YES] ...... [OKAY]fused_adam - ............. fused_adam[NO] fused_adam ............. ....... ............. [NO] [OKAY] [NO] -....... .......fused_adam[OKAY]fused_lamb - [OKAY].......................... -fused_lamb [NO]fused_lamb[NO] ............. ....... ....................[NO][OKAY] [NO] - [OKAY].............. - [OKAY][OKAY] - -fused_lamb ............. [NO] ....... sparse_attn[OKAY] -............ 
[NO] ....... sparse_attnsparse_attn[OKAY] -........................ [NO]transformer[NO] .......................... sparse_attn [OKAY][OKAY] - [NO] - transformer................... ............transformer[OKAY][NO] - [NO]................... stochastic_transformer .......[NO][OKAY] -[OKAY]........ - [OKAY][NO]transformer - stochastic_transformer................... stochastic_transformer [OKAY].[NO] - . [NO] [NO].............. ....... [OKAY][OKAY] -[OKAY] - -stochastic_transformer . [NO] ....... [OKAY] -ninjaninjaninjaninja ........................................................................ [OKAY][OKAY][OKAY] - -[OKAY] ----------------------------------------------------------------------------------------------------- - - -----------------------------------------------------------------------------------------------------op nameop name - - ................op name................op name installed ................installed ................ .. .. installedinstalledcompatible compatible - .. -..-------------------------------------------------- -------------------------------------------------- - -compatiblecompatible - ----------------------------------------------------------------------------------------------------- - -cpu_adam ...............cpu_adam [YES]............... cpu_adamcpu_adam......[YES] ....................................[OKAY] -[YES][OKAY][YES] - ............ [OKAY][OKAY] - -fused_adam ............. [NO] fused_adam....... .............[OKAY] -[NO]fused_adamfused_adam fused_lamb.................... [OKAY].............[NO] - ............. [NO]fused_lamb.......[NO] ........................... [OKAY] [NO] [OKAY] -[OKAY] -....... - [OKAY]fused_lamb - fused_lamb............. .............[NO] [NO]....... .......sparse_attn[OKAY] -[OKAY]............ - sparse_attn[NO] ................... [NO][OKAY] -....... [OKAY] -transformersparse_attn transformer............sparse_attn [NO] ........................ ............ .......[NO][NO] .......[OKAY][NO] -....... [OKAY]....... -[OKAY]stochastic_transformer - [OKAY]stochastic_transformer - .transformer . [NO]transformer ............ [NO] ................... [NO]....... [OKAY] [NO] [OKAY] - -.............. [OKAY][OKAY] - -stochastic_transformerstochastic_transformer .. [NO] [NO]....... .......[OKAY] [OKAY] - -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install pathtorch install path .............................. ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch versiontorch version ........................................ 1.8.11.8.1 - -torch cuda versiontorch cuda version .............................. 11.111.1 - -nvcc versionnvcc version .......................................... 11.211.2 - -deepspeed install pathdeepspeed install path ...................... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed info deepspeed info................... ...................0.4.2+bc17042, bc17042, big-science -0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. deepspeed wheel compiled w....... 
......torch 1.8, cuda 11.1 -torch 1.8, cuda 11.1 --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- - - -DeepSpeed C++/CUDA extension op report -DeepSpeed C++/CUDA extension op reportDeepSpeed C++/CUDA extension op report - - -DeepSpeed C++/CUDA extension op report---------------------------------------------------------------------------------------------------- - ---------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- - ---------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- - -JIT compiled ops requires ninja ---------------------------------------------------JIT compiled ops requires ninjaJIT compiled ops requires ninja - - - -JIT compiled ops requires ninja -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... 
torch 1.8, cuda 11.1 -JIT compiled ops requires ninja-------------------------------------------------- - -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. ----------------------------------------------------------------------------------------------------- - -JIT compiled ops requires ninjaDeepSpeed C++/CUDA extension op report - --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.-------------------------------------------------- --------------------------------------------------- - -async_io ............... [NO] ....... [NO] -DeepSpeed C++/CUDA extension op reportJIT compiled ops requires ninja - --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -ninjaninjaninjaninja ........................................................................ [OKAY][OKAY][OKAY][OKAY] - - - --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- - - - -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -op nameop nameop name op name ................ ................................ ................ installed installedinstalled installed .. .... .. compatible compatible -compatiblecompatible - --------------------------------------------------- --------------------------------------------------- ----------------------------------------------------------------------------------------------------- - - -cpu_adamcpu_adam cpu_adamcpu_adam............... .............................................[YES] [YES]......[YES][YES] ......[OKAY]............ - [OKAY] [OKAY] -[OKAY] - -fused_adam .............fused_adam fused_adam fused_adam[NO] ............. ............. ............. .......[NO] [NO] [NO] [OKAY] .............. -....... [OKAY][OKAY][OKAY] - -fused_lamb - .............fused_lamb fused_lambfused_lamb [NO] ............. .......................... ....... [NO] [NO][NO] [OKAY] ....... -.............. [OKAY][OKAY][OKAY] - - -sparse_attn ............ [NO]sparse_attn sparse_attnsparse_attn....... ........................[OKAY]............ 
-DeepSpeed general environment info: - [NO][NO][NO] .....................transformer [OKAY][OKAY][OKAY]............ - - -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - [NO] transformer.......transformer transformer ............ [OKAY] ........................ -[NO] [NO][NO]....... ..............stochastic_transformer[OKAY] -torch version .................... 1.8.1 -[OKAY][OKAY] - -. [NO] .......stochastic_transformerstochastic_transformerstochastic_transformer [OKAY] -torch cuda version ............... 11.1 -... [NO][NO] [NO] .............. .......[OKAY][OKAY] - -[OKAY] -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -/bin/sh: line 0: type: git: not found -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -ninjaninjaninjaninja .................................... .................................... [OKAY] [OKAY] -[OKAY][OKAY] - --------------------------------------------------- --------------------------------------------------- - -----------------------------------------------------------------------------------------------------op nameop name - -op nameop name................................ ................ installed ................ installed ..installedinstalled .. .. ..compatible compatible -compatible - -------------------------------------------------------------------------------------------------------------------------------------------------------compatible - - - --------------------------------------------------- -cpu_adamcpu_adam cpu_adam ...............cpu_adam ............... [YES].............................. ...... 
[YES] [YES][OKAY][YES] - .................. [OKAY] [OKAY] -[OKAY] - -fused_adam ............. [NO] ....... fused_adam[OKAY]fused_adamfused_adam - ............. ............. ............. fused_lamb[NO] .............[NO] [NO][NO] ....... ....... .......[OKAY] - ....... [OKAY][OKAY]fused_lamb -[OKAY] - -............. [NO] .......fused_lambfused_lamb [OKAY].......................... - sparse_attn[NO][NO] ................... .......[NO][OKAY] -.......[OKAY]sparse_attn -[OKAY] -............ [NO] transformer....... ............sparse_attn[OKAY] -[NO]............sparse_attn transformer....... [NO] ............[OKAY] -............ ....... [NO] stochastic_transformer[NO] [OKAY] ....... -........ transformer[OKAY][NO][OKAY] ....... - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -[OKAY]............ -async_io ............... [NO] ....... [NO] -transformer [NO] stochastic_transformer................... [NO].[OKAY] -.......[NO] [OKAY]....... - [OKAY] -transformer_inference .. [NO] ....... [OKAY] -stochastic_transformer stochastic_transformer . .[NO] [NO]....... .......[OKAY] -[OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- ---------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - --------------------------------------------------- -DeepSpeed C++/CUDA extension op reportJIT compiled ops requires ninja - ----------------------------------------------------------------------------------------------------- - -DeepSpeed C++/CUDA extension op reportNOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.-------------------------------------------------- - --------------------------------------------------- --------------------------------------------------- - -DeepSpeed C++/CUDA extension op reportNOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - -JIT compiled ops requires ninja-------------------------------------------------- --------------------------------------------------- - - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -JIT compiled ops requires ninjaNOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - --------------------------------------------------- -JIT compiled ops requires ninja -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -DeepSpeed general environment info: -transformer_inference .. [NO] ....... 
[OKAY] -utils .................. [YES] ...... [OKAY] -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -quantizer .............. [NO] ....... [OKAY] -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 --------------------------------------------------- -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 ----------------------------------------------------------------------------------------------------- - ---------------------------------------------------DeepSpeed C++/CUDA extension op report--------------------------------------------------DeepSpeed C++/CUDA extension op report - - - ---------------------------------------------------DeepSpeed C++/CUDA extension op report-------------------------------------------------- -DeepSpeed C++/CUDA extension op report - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.-------------------------------------------------- --------------------------------------------------- - --------------------------------------------------- ---------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - - - -JIT compiled ops requires ninjaJIT compiled ops requires ninja---------------------------------------------------------------------------------------------------- - - - -JIT compiled ops requires ninjaJIT compiled ops requires ninja - --------------------------------------------------- ---------------------------------------------------DeepSpeed C++/CUDA extension op report-------------------------------------------------- - - ---------------------------------------------------DeepSpeed C++/CUDA extension op reportDeepSpeed C++/CUDA extension op report-------------------------------------------------- - - - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.---------------------------------------------------------------------------------------------------- -DeepSpeed C++/CUDA extension op report - --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. 
Op compatibility means that your system - meet the required dependencies to JIT install the op.-------------------------------------------------- - - - ---------------------------------------------------JIT compiled ops requires ninjaNOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.-------------------------------------------------- - -JIT compiled ops requires ninja - - ---------------------------------------------------JIT compiled ops requires ninja - -JIT compiled ops requires ninja - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- ------------------------------------------------------------------------------------------------------------------------------------------------------- - - -DeepSpeed C++/CUDA extension op reportDeepSpeed C++/CUDA extension op reportDeepSpeed C++/CUDA extension op report - - ------------------------------------------------------------------------------------------------------------------------------------------------------- --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - -DeepSpeed C++/CUDA extension op report ------------------------------------------------------------------------------------------------------------------------------------------------------- - - - ---------------------------------------------------JIT compiled ops requires ninjaJIT compiled ops requires ninja -JIT compiled ops requires ninja -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - - --------------------------------------------------- -JIT compiled ops requires ninja -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -ninjaninjaninjaninja .................. ......................................................[OKAY] -[OKAY][OKAY][OKAY]-------------------------------------------------- - - - ---------------------------------------------------op name---------------------------------------------------------------------------------------------------- - - -................op name op nameop nameinstalled................ 
installed.................. ................ ..compatible -installed --------------------------------------------------compatible..installed - - compatible--------------------------------------------------.. - - --------------------------------------------------compatible -DeepSpeed general environment info: -cpu_adam - ...............-------------------------------------------------- cpu_adam[YES] - cpu_adam..................... ...............[YES][OKAY] -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 - [YES]...... ......[OKAY]cpu_adam - ............... [OKAY] -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -fused_adam[YES] ............. ......[NO] fused_adam [OKAY] fused_adam....... -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -............. .............[NO][OKAY] -[NO]....... .......[OKAY]fused_lamb - [OKAY]............. - fused_adam[NO]fused_lambfused_lamb ....... ....................................... [OKAY][NO][NO][NO] - ..................... [OKAY] [OKAY] -[OKAY] - -sparse_attn fused_lamb............ .............[NO] [NO]....... sparse_attnsparse_attn .......[OKAY] ............ - ............ transformer[NO][NO][OKAY] ............ -.............. [OKAY][NO][OKAY] - -....... [OKAY]transformertransformer - ........................ [NO][NO]stochastic_transformer .......sparse_attn....... . [OKAY] ............ -[OKAY][NO] -.......[NO]stochastic_transformer [OKAY]stochastic_transformer -........ . [OKAY] [NO] -[NO] .............. transformer [OKAY] [OKAY] - -............ [NO] ....... [OKAY] -ninjaninjaninjaninja ...................................................... .................. [OKAY] [OKAY][OKAY] -[OKAY] - -stochastic_transformer . [NO] ....... [OKAY] - ------------------------------------------------------------------------------------------------------------------------------------------------------- - --------------------------------------------------- -op name -op name op name op name................................................ ................installedinstalledinstalled installed...... compatible..compatiblecompatible - - -compatible---------------------------------------------------------------------------------------------------- --------------------------------------------------- - - --------------------------------------------------- -cpu_adamcpu_adamcpu_adam cpu_adam............................................. ...............[YES][YES][YES] ......[YES]............ ......[OKAY][OKAY] [OKAY] - -[OKAY] - -fused_adamfused_adam fused_adamfused_adam ............. ............. [NO].............[NO]............. .......[NO][NO]....... [OKAY].......[OKAY] - ....... -[OKAY] -[OKAY]fused_lamb -fused_lamb .............fused_lamb............. [NO]fused_lamb[NO]............. ...........................[NO] [NO][OKAY][OKAY]....... - - .......[OKAY] -[OKAY] -sparse_attnsparse_attn ........................sparse_attn sparse_attn[NO][NO]............ ...................[NO]....... [OKAY][NO][OKAY]....... - - .......[OKAY] -transformertransformer[OKAY] -........................transformer [NO]transformer[NO]............ ...................[NO]....... 
[NO][OKAY].......[OKAY] - -.......[OKAY] -[OKAY] -stochastic_transformerstochastic_transformer stochastic_transformerstochastic_transformer.. [NO][NO]. ...............[NO] [NO][OKAY][OKAY]....... - - .......[OKAY] -[OKAY] -ninjaninjaninjaninja ........................................................................ [OKAY][OKAY][OKAY] -[OKAY] - - ------------------------------------------------------------------------------------------------------------------------------------------------------- --------------------------------------------------- - -op name -op nameop name................ op name................ installed ................................ installed .. ..installed installed compatible compatible -.... --------------------------------------------------- ---------------------------------------------------compatiblecompatible - - ----------------------------------------------------------------------------------------------------- - -cpu_adam ...............cpu_adam cpu_adam [YES]cpu_adam ............... ............... .....................[YES] [YES][OKAY]......[YES] - [OKAY]............ - [OKAY][OKAY] - -fused_adam ............. fused_adam[NO]fused_adam fused_adam ................................. [NO][NO] [OKAY] ....... -............. ....... [OKAY] fused_lamb -[NO][OKAY] -............. fused_lamb.......[NO] fused_lamb ............. .......[OKAY] ............. [NO] - [OKAY] -[NO]fused_lamb....... ....................[OKAY] [NO] - .......[OKAY] -[OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformersparse_attnsparse_attn sparse_attn............ ............ ............[NO]............[NO] ..............[NO] [NO] [OKAY][OKAY].............. - - [OKAY][OKAY] - -transformer stochastic_transformer............ transformertransformer [NO] . ............ ...................[NO] [OKAY].......[NO][NO] -[OKAY] ....... -....... stochastic_transformer[OKAY][OKAY] - -. [NO] .......stochastic_transformerstochastic_transformer [OKAY] -.. [NO][NO] .............. [OKAY][OKAY] - -ninjaninjaninjaninja .................................... .................................... [OKAY][OKAY] [OKAY][OKAY] - - - --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- - - - -op nameop nameop name op name ................ ................ ................................ installedinstalled installed ....installed .. compatiblecompatible .. - -compatible--------------------------------------------------compatible-------------------------------------------------- - - - ----------------------------------------------------------------------------------------------------- - -cpu_adamcpu_adam cpu_adam...............cpu_adam............... ...............[YES] [YES] ...............[YES]...... ......[OKAY]......[YES] - [OKAY][OKAY]...... - - [OKAY] -fused_adam .............fused_adam fused_adam[NO]fused_adam ............. .......................... [NO]....... [NO][OKAY][NO] -....... ..............fused_lamb[OKAY] - .............[OKAY][OKAY] - -fused_lamb[NO] fused_lamb.............fused_lamb....... [NO] [OKAY].......................... -....... [NO][NO][OKAY] -.............. [OKAY][OKAY] - -sparse_attn ............ [NO] ....... [OKAY]sparse_attn - ............ sparse_attnsparse_attn [NO]transformer ........................................... [NO] [NO][OKAY][NO] -.............. 
transformer[OKAY].......[OKAY] - - ............[OKAY] transformerstochastic_transformer - [NO] .............transformer....... [NO] [NO][OKAY] ............ - ....... [NO].......stochastic_transformer[OKAY] - .......[OKAY] -.[OKAY] -[NO]stochastic_transformer .......stochastic_transformer . [OKAY] . -[NO] [NO]....... .......[OKAY] -[OKAY] -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install pathtorch install path ............... ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch versiontorch version ........................................ 1.8.11.8.1 - -torch cuda versiontorch cuda version .............................. 11.111.1 - -nvcc versionnvcc version .......................................... 11.211.2 - -deepspeed install pathdeepspeed install path ...................... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed infodeepspeed info ...................................... 0.4.2+bc17042, bc17042, big-science0.4.2+bc17042, bc17042, big-science - -deepspeed wheel compiled w.deepspeed wheel compiled w. ............ torch 1.8, cuda 11.1torch 1.8, cuda 11.1 - --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- ---------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - ------------------------------------------------------------------------------------------------------------------------------------------------------- -DeepSpeed C++/CUDA extension op reportJIT compiled ops requires ninja - - - -DeepSpeed C++/CUDA extension op reportDeepSpeed C++/CUDA extension op report-------------------------------------------------- - - ---------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -JIT compiled ops requires ninja - -JIT compiled ops requires ninja -ninjaninjaninjaninja ...................................................... ..................[OKAY][OKAY][OKAY] - - -[OKAY]---------------------------------------------------------------------------------------------------- - --------------------------------------------------- -op name ---------------------------------------------------op name -op name................ op name................................installed ..................installedinstalled installed.. compatible.. 
-/bin/sh: line 0: type: git: not found
-**** Git info for Megatron: git_hash=unknown git_branch=unknown ****
- [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`.
-async_io ............... [NO] ....... [NO]
-transformer_inference .. [NO] ....... [OKAY]
-utils .................. [YES] ...... [OKAY]
-quantizer .............. [NO] ....... [OKAY]
---------------------------------------------------
-fused_lamb fused_lamb[NO].......................... .......[NO].............[NO] [OKAY]....... -[NO][OKAY]....... - .......[OKAY] -[OKAY] -sparse_attnsparse_attn ............sparse_attn............ sparse_attn [NO][NO] ............ .......................... [OKAY][NO][NO][OKAY] - - .............. transformer[OKAY]transformer[OKAY] - -........................ transformer [NO]transformer [NO] ................... ............ .......[OKAY] [NO] -[NO][OKAY] - ..............stochastic_transformer [OKAY]stochastic_transformer[OKAY] - -. .[NO] stochastic_transformer[NO]stochastic_transformer ....... ........ . [OKAY] [NO][OKAY] - - [NO]....... .......[OKAY] -[OKAY] -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 ------------------------------------------------------------------------------------------------------------------------------------------------------- - - -DeepSpeed C++/CUDA extension op reportDeepSpeed C++/CUDA extension op reportDeepSpeed C++/CUDA extension op report - - ----------------------------------------------------------------------------------------------------- --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.-------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja - ---------------------------------------------------JIT compiled ops requires ninja-------------------------------------------------- - -JIT compiled ops requires ninja - -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... 
[NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install pathtorch install path .............................. ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch versiontorch version ........................................ 1.8.11.8.1 - -torch cuda versiontorch cuda version .............................. 11.111.1 - -nvcc versionnvcc versionDeepSpeed general environment info: .......................................... -11.211.2 - -deepspeed install pathdeepspeed install path torch install path...................... ...............['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed infodeepspeed info ...................................... 0.4.2+bc17042, bc17042, big-science['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']0.4.2+bc17042, bc17042, big-science - - -deepspeed wheel compiled w. deepspeed wheel compiled w....... torch version ...... torch 1.8, cuda 11.1 .................... -torch 1.8, cuda 11.1 -1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -ninjaninjaninja ninja ...................................................... [OKAY] [OKAY].................. - - [OKAY]--------------------------------------------------[OKAY]-------------------------------------------------- - - - -op name-------------------------------------------------- op name ---------------------------------------------------................ op name -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -................installed ................installedop name .. installed.. ................ 
compatible..compatible - - installed----------------------------------------------------------------------------------------------------compatible - - -..-------------------------------------------------- -compatible -cpu_adam-------------------------------------------------- cpu_adam -............... ...............cpu_adam[YES] [YES]..................... ......[YES][OKAY] -[OKAY]cpu_adam...... - [OKAY]............... - [YES] fused_adam...... .............fused_adam[OKAY] -fused_adam[NO]............. ....................[NO] [NO].......[OKAY] ....... -[OKAY] -[OKAY] -fused_lambfused_lambfused_lamb fused_adam .......................... .............[NO].............[NO] [NO]....... ..............[NO][OKAY] -[OKAY] [OKAY] - -....... [OKAY] -fused_lamb ............. sparse_attn[NO] sparse_attn............sparse_attn ............[NO]................... [NO]....... [NO] [OKAY] [OKAY].............. - - [OKAY][OKAY] - -transformer transformer............transformer ............[NO]............ .......[NO][NO] [OKAY].............. - sparse_attn[OKAY][OKAY] - -stochastic_transformer............ stochastic_transformerstochastic_transformer [NO]. . .[NO] ....... [NO][NO] .......[OKAY].............. - [OKAY][OKAY][OKAY] - - -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ...................DeepSpeed general environment info: DeepSpeed general environment info:0.4.2+bc17042, bc17042, big-science - -deepspeed wheel compiled w. - ...... torch 1.8, cuda 11.1torch install path - torch install path............... ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version torch version.................... ....................1.8.1 -1.8.1 -torch cuda version torch cuda version............... ...............11.1 - 11.1nvcc version - .....................nvcc version 11.2..................... - 11.2deepspeed install path - ...........deepspeed install path ...........['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']deepspeed info - deepspeed info................... ................... 0.4.2+bc17042, bc17042, big-science0.4.2+bc17042, bc17042, big-science - -deepspeed wheel compiled w.deepspeed wheel compiled w. ............ torch 1.8, cuda 11.1 -torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 
11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -DeepSpeed general environment info: -async_io async_io............... [NO]............... [NO]....... .......[NO] -[NO] -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -transformer_inference .. transformer_inference[NO] ......... [NO][OKAY] -torch version .................... 1.8.1 -....... [OKAY] -torch cuda version ............... 11.1 -utils .................. utils[YES] ........................ [OKAY][YES] -nvcc version ..................... 11.2 - ...... [OKAY] -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -quantizer ..............quantizer [NO].............. .......[NO] [OKAY]....... - [OKAY] --------------------------------------------------- -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 --------------------------------------------------- -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_ioasync_io .............................. [NO][NO] .............. [NO][NO] - -transformer_inference transformer_inference.. ..[NO] [NO]....... .......[OKAY] -[OKAY] -utilsutils .................................... [YES][YES] ............ [OKAY][OKAY] - -quantizerquantizer ............................ [NO][NO] .............. [OKAY][OKAY] - --------------------------------------------------- --------------------------------------------------- -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... 
torch 1.8, cuda 11.1 ----------------------------------------------------------------------------------------------------- - -DeepSpeed C++/CUDA extension op reportDeepSpeed C++/CUDA extension op report - --------------------------------------------------- ---------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - ---------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - -JIT compiled ops requires ninja --------------------------------------------------- -JIT compiled ops requires ninja - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -DeepSpeed general environment info: -async_io ............... [NO] ....... [NO] -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -transformer_inference .. [NO] ....... [OKAY] -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -async_io ............... [NO] ....... [NO] -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -transformer_inference .. [NO] ....... [OKAY] -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -utils .................. [YES] ...... [OKAY] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. 
-async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -transformer_inference .. [NO] ....... [OKAY] -utils async_io.................. ...............[YES] [NO]...... .......[OKAY] -[NO] -quantizer .............. [NO] ....... [OKAY] ---------------------------------------------------transformer_inference - .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -torch cuda version ............... 11.1 -async_io ............... [NO] ....... [NO] -nvcc version ..................... 11.2 -transformer_inference .. [NO] ....... [OKAY] -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -utils .................. [YES] ...... [OKAY] -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -quantizer .............. [NO] ....... [OKAY] -DeepSpeed general environment info: - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. --------------------------------------------------- -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -async_io ............... [NO] ....... [NO] -torch version .................... 1.8.1 -transformer_inference .. [NO] ....... [OKAY] -torch cuda version ............... 11.1 -utils .................. [YES] ...... [OKAY] -nvcc version ..................... 11.2 -quantizer .............. [NO] ....... [OKAY] -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] --------------------------------------------------- -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 
0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -ninjaninja .................. ..................[OKAY] -[OKAY]-------------------------------------------------- - ---------------------------------------------------op name - ................op name installed .................. compatibleinstalled - --------------------------------------------------.. - compatible --------------------------------------------------- -cpu_adam ............... [YES] cpu_adam...... ...............[OKAY] -[YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY]fused_adam - ............. fused_lamb[NO] ............. .......[NO] [OKAY]....... -[OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -sparse_attn ............transformer [NO]............ [NO]....... .......[OKAY] -[OKAY] -transformer stochastic_transformer............ .[NO] [NO]....... ....... [OKAY][OKAY] - -stochastic_transformer . [NO] ....... [OKAY] -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 ----------------------------------------------------------------------------------------------------- - -DeepSpeed C++/CUDA extension op reportDeepSpeed C++/CUDA extension op report - ----------------------------------------------------------------------------------------------------- - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.-------------------------------------------------- - --------------------------------------------------- --------------------------------------------------- - -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -DeepSpeed C++/CUDA extension op reportJIT compiled ops requires ninja -JIT compiled ops requires ninja - --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. 
Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -async_io ............... async_io[NO] ...................... [NO][NO] -....... [NO] -transformer_inferencetransformer_inference .... [NO][NO] .............. [OKAY][OKAY] - -utilsutils .................................... [YES][YES] ............ [OKAY][OKAY] - -quantizerquantizer ............................ [NO][NO] .............. [OKAY][OKAY] - ----------------------------------------------------------------------------------------------------- - --------------------------------------------------- -DeepSpeed C++/CUDA extension op report ----------------------------------------------------------------------------------------------------- - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.DeepSpeed C++/CUDA extension op report - --------------------------------------------------- --------------------------------------------------- ---------------------------------------------------JIT compiled ops requires ninjaNOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - - ---------------------------------------------------DeepSpeed C++/CUDA extension op report - -JIT compiled ops requires ninja-------------------------------------------------- - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. ----------------------------------------------------------------------------------------------------- - -JIT compiled ops requires ninja -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - - -DeepSpeed general environment info: -torch install path ............... 
['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.-------------------------------------------------- - ---------------------------------------------------DeepSpeed C++/CUDA extension op report - -JIT compiled ops requires ninja-------------------------------------------------- - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.-------------------------------------------------- --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op reportJIT compiled ops requires ninja - - ---------------------------------------------------DeepSpeed C++/CUDA extension op report - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. 
Op compatibility means that your system - meet the required dependencies to JIT install the op.-------------------------------------------------- - ---------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - -JIT compiled ops requires ninja-------------------------------------------------- - -JIT compiled ops requires ninja -ninjaninjaninjaninja ........................................................................ [OKAY][OKAY][OKAY][OKAY] - - - -ninjaninjaninjaninja ...................................................... .................. [OKAY] [OKAY] -[OKAY][OKAY] ----------------------------------------------------------------------------------------------------- - -----------------------------------------------------------------------------------------------------op name - - - ------------------------------------------------------------------------------------------------------------------------------------------------------- - --------------------------------------------------- -op name op name op name................ ................ ................ ................installed installed installed ..installed .. ..compatible .. -compatiblecompatible - -op nameop name ---------------------------------------------------compatible-------------------------------------------------- - --------------------------------------------------- - --------------------------------------------------- - op name................op name ................ ................................installedinstalled .. installedinstalled.. compatiblecompatible -.... --------------------------------------------------- -compatible--------------------------------------------------compatible - - -cpu_adamcpu_adam .............................. cpu_adamcpu_adam[YES] [YES].................................... ......[YES][YES][OKAY] --------------------------------------------------- --------------------------------------------------- -cpu_adam ............... [YES]cpu_adam ......cpu_adam............... cpu_adam...............[YES][OKAY] - ......[OKAY]...... - [OKAY] -[OKAY] -...............[YES]...... ......[YES][OKAY] -[OKAY]...... -fused_adam ............. fused_adam[NO] .................... fused_adam[NO] [OKAY]fused_adam ............. -....... ............. [NO] fused_lamb[NO][OKAY] ....... - fused_adam[OKAY] -............. ....... [OKAY] fused_lamb [NO] - [OKAY].................... -............. [NO] ....... fused_adam[OKAY] - [NO]fused_lamb[OKAY] ....... -.............fused_lamb [OKAY][NO]............. -............. fused_adam[NO]fused_lamb fused_adam................................. [NO].............[NO][OKAY] - .......[NO] [OKAY] -....... [OKAY] -..............[NO] [OKAY][OKAY] - fused_lamb -....... .............[OKAY] fused_lamb[NO] -sparse_attn ............ [NO] sparse_attn....... ............[OKAY] - .................... fused_lamb [NO] sparse_attn.................... [OKAY]............ -[NO] .......sparse_attn transformer[OKAY]............ - [NO][OKAY][NO] -....... .......[OKAY] -[OKAY] - sparse_attn............[NO] transformer ............ [NO]................... [NO] .......[OKAY][NO] - [OKAY].............. -sparse_attn ............transformer [NO]............ .......[NO] sparse_attn[OKAY].......sparse_attn - transformer[OKAY][OKAY] -stochastic_transformer -............ [NO] transformer........ 
stochastic_transformer [NO][OKAY] ............ - ............[OKAY]transformer............ - ........[NO] [NO][OKAY]stochastic_transformer ....... - [NO] ............ [NO]stochastic_transformer....... [NO][OKAY]........ - ....... [OKAY][OKAY]. - - ....... [NO][OKAY]transformer - [OKAY]................... - [NO] .......stochastic_transformer [OKAY] -. [NO] ....... [OKAY] - transformer[OKAY] -[NO]stochastic_transformer............ ....... [NO].[OKAY] -.......[NO] .......[OKAY]stochastic_transformer - [OKAY] -. stochastic_transformer[NO] ....... .[OKAY] - [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ...............async_io [NO]............... .......[NO] [NO]....... - [NO] -transformer_inference .. [NO] .......transformer_inference [OKAY].. - [NO] ....... [OKAY] -utils .................. [YES] utils...... ..................[OKAY] -[YES] ...... [OKAY] -quantizer .............. [NO] quantizer....... ..............[OKAY] -[NO] ....... [OKAY]-------------------------------------------------- - --------------------------------------------------- -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. ----------------------------------------------------------------------------------------------------- - -nvcc version ..................... 11.2 -JIT compiled ops requires ninjaDeepSpeed C++/CUDA extension op report --------------------------------------------------- --------------------------------------------------- - -DeepSpeed C++/CUDA extension op reportNOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - ----------------------------------------------------------------------------------------------------- - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.JIT compiled ops requires ninja - --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 
0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -/bin/sh: line 0: type: git: not found -ninjaninjaninjaninja .................. .................. .................................... [OKAY][OKAY] - [OKAY] -[OKAY]-------------------------------------------------- --------------------------------------------------- - - ---------------------------------------------------op name ---------------------------------------------------op name -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -op name ................ ................ op name................ installed installedinstalled.................. ..compatible .. -installed compatible compatible-------------------------------------------------- -.. --------------------------------------------------- - ---------------------------------------------------compatible - --------------------------------------------------- -cpu_adam cpu_adam............... ...............[YES] cpu_adam[YES] ...... cpu_adam............... ......[OKAY][YES] - ...............[OKAY]...... - [YES][OKAY] -...... [OKAY] -fused_adam .............fused_adam [NO]............. .......[NO] fused_adam [OKAY] fused_adam....... -............. [OKAY].............fused_lamb[NO] -[NO] ...........................fused_lamb [NO][OKAY][OKAY] ............. - - .......[NO] [OKAY]fused_lamb.......fused_lamb -............. [OKAY][NO]............. - .......[NO] [OKAY] -....... [OKAY]sparse_attn - ............ [NO] sparse_attn....... ............[OKAY] -[NO] sparse_attn.......transformer ............sparse_attn............[OKAY] -[NO]............[NO]transformer ..............[NO] ............ [OKAY][OKAY] ....... - -[NO] [OKAY]....... -transformer stochastic_transformer [OKAY]transformer ............ - .............[NO] stochastic_transformer [NO][NO] ....... ....... ....... . [OKAY][OKAY] [OKAY] - -[NO] - ....... stochastic_transformer[OKAY] -stochastic_transformer . .[NO] [NO]....... [OKAY]....... - [OKAY] ----------------------------------------------------------------------------------------------------- - ---------------------------------------------------DeepSpeed C++/CUDA extension op reportDeepSpeed C++/CUDA extension op report - -DeepSpeed C++/CUDA extension op report - --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- - - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. -DeepSpeed C++/CUDA extension op reportNOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. 
---------------------------------------------------
-DeepSpeed C++/CUDA extension op report
---------------------------------------------------
-NOTE: Ops not installed will be just-in-time (JIT) compiled at
-      runtime if needed. Op compatibility means that your system
-      meets the required dependencies to JIT install the op.
---------------------------------------------------
-JIT compiled ops requires ninja
-ninja .................. [OKAY]
---------------------------------------------------
-op name ................ installed .. compatible
---------------------------------------------------
-cpu_adam ............... [YES] ...... [OKAY]
-fused_adam ............. [NO] ....... [OKAY]
-fused_lamb ............. [NO] ....... [OKAY]
-sparse_attn ............ [NO] ....... [OKAY]
-transformer ............ [NO] ....... [OKAY]
-stochastic_transformer . [NO] ....... [OKAY]
---------------------------------------------------
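Note: per the report above, ops marked installed [NO] are built with ninja on first use. A rough sketch, not from this run, of pre-building one such op so the first training step does not pay the compile cost; FusedAdamBuilder and .load() follow the DeepSpeed 0.4.x op_builder layout and are an assumption elsewhere:

    # Hypothetical ahead-of-time JIT build of the fused_adam op (DeepSpeed 0.4.x names).
    from deepspeed.ops.op_builder import FusedAdamBuilder

    # .load() returns the compiled extension module, triggering a ninja build
    # if no prebuilt binary is found in the installed wheel.
    fused_adam_module = FusedAdamBuilder().load()
    print(fused_adam_module)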
-DeepSpeed general environment info:
-torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']
-torch version .................... 1.8.1
-torch cuda version ............... 11.1
-nvcc version ..................... 11.2
-deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']
-deepspeed info ................... 0.4.2+bc17042, bc17042, big-science
-deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1
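Note: the environment block above can also be reproduced programmatically. A minimal sketch using only standard torch/deepspeed attributes; the dotted labels merely imitate the log's layout:

    # Reproduce the "DeepSpeed general environment info" fields by hand.
    import torch
    import deepspeed

    print("torch install path ...............", list(torch.__path__))
    print("torch version ....................", torch.__version__)
    print("torch cuda version ...............", torch.version.cuda)
    print("deepspeed info ...................", deepspeed.__version__)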
-/bin/sh: line 0: type: git: not found
-**** Git info for Megatron: git_hash=unknown git_branch=unknown ****
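Note: Megatron stamps its logs with the current git hash/branch by shelling out to git; on these compute nodes git is not on PATH, so both fields fall back to unknown. A sketch of that kind of probe, mirroring the behavior in spirit rather than quoting Megatron's actual code:

    # Hypothetical git probe with the "unknown" fallback seen in this log.
    import subprocess

    def git_info(args, fallback="unknown"):
        try:
            out = subprocess.check_output(["git"] + args, stderr=subprocess.DEVNULL)
            return out.decode().strip()
        except (OSError, subprocess.CalledProcessError):
            return fallback  # e.g. git missing from PATH on the compute node

    print(f"**** Git info: git_hash={git_info(['rev-parse', '--short', 'HEAD'])} "
          f"git_branch={git_info(['rev-parse', '--abbrev-ref', 'HEAD'])} ****")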
Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install pathtorch install path .............................. ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch versiontorch version ........................................ 1.8.11.8.1 - -torch cuda versiontorch cuda version .............................. 11.111.1 - -nvcc versionnvcc version .......................................... 11.211.2 - -deepspeed install pathdeepspeed install path ...................... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed infodeepspeed info ...................................... 0.4.2+bc17042, bc17042, big-science0.4.2+bc17042, bc17042, big-science - -deepspeed wheel compiled w.deepspeed wheel compiled w. ............ torch 1.8, cuda 11.1torch 1.8, cuda 11.1 - -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... 
[OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- --------------------------------------------------- -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... 
torch 1.8, cuda 11.1 --------------------------------------------------- -----------------------------------------------------------------------------------------------------DeepSpeed C++/CUDA extension op report-------------------------------------------------- - - - -DeepSpeed C++/CUDA extension op reportDeepSpeed C++/CUDA extension op report-------------------------------------------------- - -DeepSpeed C++/CUDA extension op report -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.-------------------------------------------------- - ----------------------------------------------------------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - - ---------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.JIT compiled ops requires ninja-------------------------------------------------- - - - -JIT compiled ops requires ninja-------------------------------------------------- - -JIT compiled ops requires ninja -DeepSpeed general environment info: -torch install path ............... DeepSpeed general environment info:['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -DeepSpeed general environment info:torch version .................... -1.8.1torch install path - ...............torch cuda version torch install path ............... 11.1............... - ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']nvcc version - ..................... torch version11.2 -['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']....................deepspeed install path - 1.8.1........... - torch version ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']torch cuda version.................... - deepspeed info1.8.1............... - ...................11.1 torch cuda version -0.4.2+bc17042, bc17042, big-science nvcc version -............... deepspeed wheel compiled w......................11.1 -......11.2 -nvcc versiontorch 1.8, cuda 11.1deepspeed install path - ................................ 11.2 -deepspeed install path['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -...........deepspeed info ...................['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -0.4.2+bc17042, bc17042, big-sciencedeepspeed info - deepspeed wheel compiled w.................... ...... 0.4.2+bc17042, bc17042, big-sciencetorch 1.8, cuda 11.1 - -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... 
-ninja .................. [OKAY]
---------------------------------------------------
-op name ................ installed .. compatible
---------------------------------------------------
-cpu_adam ............... [YES] ...... [OKAY]
-fused_adam ............. [NO] ....... [OKAY]
-fused_lamb ............. [NO] ....... [OKAY]
-sparse_attn ............ [NO] ....... [OKAY]
-transformer ............ [NO] ....... [OKAY]
-stochastic_transformer . [NO] ....... [OKAY]
---------------------------------------------------
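Each row of this compatibility table maps to a DeepSpeed OpBuilder: "installed" says whether the op was pre-built into the wheel (here only cpu_adam shows [YES]), and "compatible" says whether the system can JIT-build it with ninja on first use. A spot-check sketch, assuming the 0.4.x builder module path and class names, not verbatim project tooling:

    # is_compatible() drives the "compatible" column; load() performs the
    # ninja JIT build for ops that show installed=[NO] (needs a working
    # CUDA toolchain on the node, and the result is cached for later runs).
    from deepspeed.ops.op_builder import CPUAdamBuilder, FusedAdamBuilder

    for builder in (CPUAdamBuilder(), FusedAdamBuilder()):
        print(builder.NAME, builder.is_compatible())

    fused_adam = FusedAdamBuilder().load()  # JIT-compiles on first call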
-/bin/sh: line 0: type: git: not found
-**** Git info for Megatron: git_hash=unknown git_branch=unknown ****
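These two lines come from the launcher probing for git under /bin/sh on compute nodes that do not have git in PATH, so Megatron's build metadata falls back to "unknown". A sketch of the pattern behind them, not the verbatim Megatron code:

    import subprocess

    def _git(cmd, default="unknown"):
        # With git absent, /bin/sh exits non-zero and we fall back.
        try:
            return subprocess.check_output(
                cmd, shell=True, stderr=subprocess.DEVNULL).decode().strip()
        except subprocess.CalledProcessError:
            return default

    git_hash = _git("git rev-parse --short HEAD")
    git_branch = _git("git rev-parse --abbrev-ref HEAD")
    print(f"**** Git info for Megatron: git_hash={git_hash} git_branch={git_branch} ****")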
-> setting tensorboard ...
1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install pathtorch install path .............................. ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch versiontorch version ........................................ 1.8.11.8.1 - -torch cuda versiontorch cuda version .............................. 11.111.1 - -nvcc versionnvcc version .......................................... 11.211.2 - -deepspeed install pathdeepspeed install path ...................... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed infodeepspeed info ...................................... 0.4.2+bc17042, bc17042, big-science0.4.2+bc17042, bc17042, big-science - -deepspeed wheel compiled w.deepspeed wheel compiled w. ............ torch 1.8, cuda 11.1torch 1.8, cuda 11.1 - -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 
11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info:DeepSpeed general environment info: - -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -torch install pathtorch install path .............................. ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch versiontorch version ........................................ 1.8.11.8.1 - -torch cuda versiontorch cuda version .............................. 11.111.1 - -nvcc versionnvcc version .......................................... 11.211.2 - -deepspeed install pathdeepspeed install path ...................... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed infodeepspeed info ...................................... 0.4.2+bc17042, bc17042, big-science0.4.2+bc17042, bc17042, big-science - -deepspeed wheel compiled w.deepspeed wheel compiled w. ............ torch 1.8, cuda 11.1torch 1.8, cuda 11.1 - -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -torch cuda version ............... 
11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -DeepSpeed general environment info:torch install path - ............... torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'].................... - 1.8.1 -torch version torch cuda version.................... ...............1.8.1 -11.1 -torch cuda versionnvcc version .................................... 11.111.2 - -nvcc versiondeepspeed install path ................................ 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... DeepSpeed general environment info:['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch version ....................torch install path 1.8.1............... - torch cuda version ............... 11.1 -['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']nvcc version - ..................... 11.2torch version - deepspeed install path.................... ...........1.8.1 -['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']torch cuda version - deepspeed info............... ...................11.1 -nvcc version0.4.2+bc17042, bc17042, big-science -.....................deepspeed wheel compiled w. 11.2...... - deepspeed install pathtorch 1.8, cuda 11.1 -........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_ioasync_io .............................. [NO][NO] .............. [NO][NO] - -transformer_inference .. [NO] ....... [OKAY] -transformer_inference .. [NO] utils....... ..................[OKAY] -[YES] ...... [OKAY] -utils .................. [YES]quantizer .................... 
[OKAY][NO] - ....... [OKAY] -quantizer --------------------------------------------------.............. - [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install pathtorch install path .............................. ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch versiontorch version ........................................ 1.8.11.8.1 - -torch cuda versiontorch cuda version ............... ...............11.1 -11.1nvcc version - nvcc version..................... .....................11.2 -11.2deepspeed install path - deepspeed install path........... ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info deepspeed info................... ...................0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w.0.4.2+bc17042, bc17042, big-science -......deepspeed wheel compiled w. torch 1.8, cuda 11.1...... - torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja ----------------------------------------------------------------------------------------------------- -DeepSpeed C++/CUDA extension op report - -DeepSpeed C++/CUDA extension op report---------------------------------------------------------------------------------------------------- - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.-------------------------------------------------- - - ---------------------------------------------------DeepSpeed C++/CUDA extension op reportNOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - - -JIT compiled ops requires ninja-------------------------------------------------- --------------------------------------------------- - -JIT compiled ops requires ninjaNOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - --------------------------------------------------- -JIT compiled ops requires ninja -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. 
- [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO]async_io ...................... [NO] -[NO] ....... [NO] -transformer_inference .. transformer_inference[NO] ......... [NO][OKAY] -....... [OKAY] -utils .................. [YES]utils ........................ [OKAY][YES] - ...... [OKAY] -quantizer ..............quantizer [NO].............. .......[NO] [OKAY]....... - [OKAY] --------------------------------------------------- --------------------------------------------------- -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -ninjaninjaninjaninja .................................... .................................... [OKAY] [OKAY] -[OKAY][OKAY] - - ----------------------------------------------------------------------------------------------------- ----------------------------------------------------------------------------------------------------- - - -op nameop name op nameop name ................ ................................ ................ installedinstalledinstalledinstalled ........ compatiblecompatiblecompatiblecompatible - - - --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- - - - -cpu_adamcpu_adam cpu_adamcpu_adam .............................. ............... ............... [YES][YES] [YES] [YES] ............ ............ [OKAY][OKAY][OKAY][OKAY] - - - -fused_adamfused_adamfused_adam fused_adam............. ............. ............. .............[NO][NO] [NO][NO].............. ..............[OKAY][OKAY] - -[OKAY][OKAY] - -fused_lambfused_lambfused_lamb ..........................fused_lamb............. [NO] [NO] [NO]............. ....... ....... .......[NO] [OKAY] [OKAY][OKAY] -....... - - [OKAY] -sparse_attnsparse_attnsparse_attn .................................... [NO]sparse_attn[NO][NO] ....... ............ .............. [OKAY] [NO] -[OKAY][OKAY] - -....... transformer[OKAY] transformer -transformer............ ........................[NO] [NO][NO].......transformer ..............[OKAY]............ - [OKAY][OKAY][NO] - - ....... stochastic_transformer[OKAY] -stochastic_transformerstochastic_transformer . [NO].. stochastic_transformer .......[NO] [NO] [OKAY]........ - .......[NO][OKAY] -[OKAY]....... - [OKAY] -DeepSpeed general environment info: -DeepSpeed general environment info:torch install path ............... - torch install pathDeepSpeed general environment info: ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']............... - - torch version ....................torch install path 1.8.1['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -............... 
torch cuda versiontorch version ................................... 11.11.8.1 -['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -nvcc version - torch cuda version..................... torch version ............... 11.2 .................... -11.1 -deepspeed install path1.8.1nvcc version - ................................ torch cuda version ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']11.2............... - - deepspeed info11.1deepspeed install path - ..............................nvcc version 0.4.2+bc17042, bc17042, big-science..................... -['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] deepspeed wheel compiled w.11.2 - -deepspeed info......deepspeed install path ...................torch 1.8, cuda 11.1........... - 0.4.2+bc17042, bc17042, big-science -['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']deepspeed wheel compiled w. - deepspeed info...... ...................torch 1.8, cuda 11.1 -0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.-------------------------------------------------- ----------------------------------------------------------------------------------------------------- - -DeepSpeed C++/CUDA extension op report-------------------------------------------------- - -JIT compiled ops requires ninjaDeepSpeed C++/CUDA extension op report --------------------------------------------------- - - -DeepSpeed C++/CUDA extension op reportNOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.-------------------------------------------------- - - -----------------------------------------------------------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - - -JIT compiled ops requires ninjaNOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.-------------------------------------------------- - - ---------------------------------------------------JIT compiled ops requires ninja - -JIT compiled ops requires ninja -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... 
[OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - -ninjaninjaninjaninja .................................... .................. ..................[OKAY][OKAY][OKAY] - - -[OKAY]---------------------------------------------------------------------------------------------------- --------------------------------------------------- - -op name - op name-------------------------------------------------- -................ op name op name................installed .................................. installed installed installed ..compatible.. - ..compatible--------------------------------------------------compatible - - -compatible---------------------------------------------------------------------------------------------------- - - --------------------------------------------------- -cpu_adam ............... [YES]cpu_adam cpu_adam...... cpu_adam.............................. [OKAY][YES][YES]............... - ............ [YES] [OKAY] [OKAY] -...... - fused_adam[OKAY] -............. [NO] ....... [OKAY]fused_adam -fused_adam .......................... fused_lamb[NO][NO] fused_adam .................... ....... [OKAY]............. - [NO] [OKAY] fused_lamb [NO] -.................... [OKAY]fused_lamb[NO] - ........................... [NO][OKAY] -[OKAY] ....... - [OKAY] -fused_lambsparse_attn ......................... [NO]sparse_attn [NO] ....... ............ .......[OKAY]sparse_attn[NO] - [OKAY].......transformer............ - [OKAY] ............[NO] - [NO]....... transformer ....... [OKAY] ............ -[OKAY] -[NO]transformer ................... sparse_attn[OKAY][NO]stochastic_transformer - ...................stochastic_transformer . [OKAY] -[NO].[NO] .......stochastic_transformer [OKAY][NO] - ............... [OKAY][OKAY][NO] - - ....... [OKAY]transformer - ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -DeepSpeed general environment info: -torch install path DeepSpeed general environment info:...............DeepSpeed general environment info: DeepSpeed general environment info: - - -['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']torch install pathtorch install path -torch install path ..............................torch version ............... .................... 1.8.1 -['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']torch cuda version['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - - -............... 11.1torch versiontorch versiontorch version - ....................nvcc version........................................ 1.8.1.....................1.8.11.8.1 - - -11.2torch cuda version -torch cuda versiontorch cuda version deepspeed install path............... ............... ............... 
...........11.1 11.1 -11.1 -nvcc version -['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']nvcc version nvcc version ..................... - ..................... deepspeed info..................... 11.2 11.211.2 -................... - -deepspeed install path deepspeed install pathdeepspeed install path 0.4.2+bc17042, bc17042, big-science................................. - deepspeed wheel compiled w. ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']...... - - - deepspeed infodeepspeed infodeepspeed infotorch 1.8, cuda 11.1 -......................................................... 0.4.2+bc17042, bc17042, big-science0.4.2+bc17042, bc17042, big-science -0.4.2+bc17042, bc17042, big-science - -deepspeed wheel compiled w.deepspeed wheel compiled w. deepspeed wheel compiled w. ...... ...... ...... torch 1.8, cuda 11.1 torch 1.8, cuda 11.1 -torch 1.8, cuda 11.1 - -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install pathtorch install path .............................. ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch versiontorch version ........................................ 1.8.11.8.1 - -torch cuda versiontorch cuda version .............................. 11.111.1 - -nvcc versionnvcc version .......................................... 11.211.2 - -deepspeed install pathdeepspeed install path ...................... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed infodeepspeed info ...................................... 0.4.2+bc17042, bc17042, big-science0.4.2+bc17042, bc17042, big-science - -deepspeed wheel compiled w.deepspeed wheel compiled w. ............ 
torch 1.8, cuda 11.1torch 1.8, cuda 11.1 - -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** ----------------------------------------------------------------------------------------------------- - -DeepSpeed C++/CUDA extension op reportDeepSpeed C++/CUDA extension op report - --------------------------------------------------- --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.-------------------------------------------------- - ---------------------------------------------------JIT compiled ops requires ninja-------------------------------------------------- --------------------------------------------------- - -DeepSpeed C++/CUDA extension op report -JIT compiled ops requires ninja -DeepSpeed C++/CUDA extension op report - ----------------------------------------------------------------------------------------------------- - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - ----------------------------------------------------------------------------------------------------- - -JIT compiled ops requires ninjaJIT compiled ops requires ninja - -ninjaninjaninjaninja .................. .................................... .................. [OKAY] [OKAY] -[OKAY][OKAY] - --------------------------------------------------- --------------------------------------------------- --------------------------------------------------- --------------------------------------------------- -op nameop name - op nameop name ................ ................................ ................ installed installed installedinstalled .. ....compatible.. -compatiblecompatible--------------------------------------------------compatible - - - ----------------------------------------------------------------------------------------------------- --------------------------------------------------- - -cpu_adam ............... [YES]cpu_adamcpu_adam cpu_adam ...... .............................. ............... [YES][OKAY] [YES] -......[YES] [OKAY]............ - [OKAY][OKAY] - -fused_adam ............. [NO] ....... fused_adam[OKAY] -.............fused_adamfused_adam [NO]fused_lamb.......................... ....................[NO][NO] [OKAY].............. -[NO] [OKAY][OKAY]....... -fused_lamb - [OKAY]............. - fused_lambfused_lamb[NO] ................................. [NO][OKAY][NO] -....... sparse_attn ....... [OKAY]............ 
- [OKAY][NO] -....... [OKAY] -sparse_attn ............transformer [NO]............ sparse_attn[NO]....... sparse_attn............[OKAY]....... -............[OKAY][NO] - transformer [NO] .......stochastic_transformer............ [NO] [OKAY] ....... -........ [OKAY]transformer[OKAY][NO] - - ................... transformer [NO] [OKAY] ............stochastic_transformer -....... [NO][OKAY] ........ - [NO][OKAY] -.......stochastic_transformer [OKAY]stochastic_transformer - . [NO]. .......[NO] [OKAY]....... - [OKAY] -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -/bin/sh: line 0: type: git: not found -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -deepspeed wheel compiled w. ...... 
torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... 
[OKAY] --------------------------------------------------- --------------------------------------------------- ---------------------------------------------------DeepSpeed C++/CUDA extension op report - ---------------------------------------------------DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - -----------------------------------------------------------------------------------------------------DeepSpeed C++/CUDA extension op report - - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.JIT compiled ops requires ninja-------------------------------------------------- - - ---------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.-------------------------------------------------- - -JIT compiled ops requires ninja-------------------------------------------------- - - -DeepSpeed C++/CUDA extension op reportJIT compiled ops requires ninja - --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -ninjaninjaninjaninja .................. .................................... .................. [OKAY][OKAY] [OKAY] - -[OKAY] - ----------------------------------------------------------------------------------------------------- --------------------------------------------------- --------------------------------------------------- -op nameop name - op name................op name................ installed................................installed installed....installed compatible..compatible.. - - --------------------------------------------------compatiblecompatible --------------------------------------------------- - - ----------------------------------------------------------------------------------------------------- - -cpu_adam ............... cpu_adam[YES] cpu_adam..................... cpu_adam...............[YES][OKAY] -.....................[YES] [OKAY][YES]...... - ......[OKAY] -[OKAY]fused_adam - ............. [NO] ....... [OKAY]fused_adam - ............. [NO]fused_adamfused_lamb ....... fused_adam .......................... [OKAY] ............. - [NO] [NO] [NO].............. fused_lamb ....... [OKAY][OKAY] ............. - -[OKAY] -[NO] fused_lamb....... .............fused_lamb [OKAY] [NO] -............. .......sparse_attn [NO] [OKAY] ................... - [NO][OKAY] -.......sparse_attn [OKAY]............ - [NO] transformer....... 
............[OKAY] -[NO] .......transformer sparse_attnsparse_attn............[OKAY] - ............[NO] ............[NO]stochastic_transformer ....... [NO] ....... [OKAY] ........ -[OKAY] -[OKAY][NO] - stochastic_transformertransformer....... transformer............[OKAY]. - ............[NO] [NO] [NO] ....... ..............[OKAY] - [OKAY] -[OKAY] -stochastic_transformer . stochastic_transformer[NO] ....... .[OKAY] -[NO] ....... [OKAY] -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install pathtorch install path ............... ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch versiontorch version ........................................ 1.8.11.8.1 - -torch cuda versiontorch cuda version .............................. 11.111.1 - -nvcc versionnvcc version .......................................... 11.211.2 - -deepspeed install pathdeepspeed install path ...................... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed infodeepspeed info ...................................... 0.4.2+bc17042, bc17042, big-science0.4.2+bc17042, bc17042, big-science - -deepspeed wheel compiled w.deepspeed wheel compiled w. ............ torch 1.8, cuda 11.1 -torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. 
[NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install pathtorch install path .............................. ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch versiontorch version ........................................ 1.8.11.8.1 - -torch cuda versiontorch cuda version .............................. 11.111.1 - -nvcc versionnvcc version .......................................... 11.211.2 - -deepspeed install pathdeepspeed install path ...................... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed infodeepspeed info ...................................... 0.4.2+bc17042, bc17042, big-science0.4.2+bc17042, bc17042, big-science - -deepspeed wheel compiled w.deepspeed wheel compiled w. ............ torch 1.8, cuda 11.1torch 1.8, cuda 11.1 - -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install pathDeepSpeed general environment info: torch install path............... ............... 
- torch install path['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']............... - torch version torch version.................... ....................1.8.1 -['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']1.8.1 - -torch cuda version ...............torch cuda versiontorch version 11.1................................... - nvcc version11.11.8.1 - -.....................nvcc version 11.2torch cuda version..................... - deepspeed install path...............11.2 - ...........deepspeed install path 11.1 -...........['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']nvcc version - ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'].....................deepspeed info - 11.2...................deepspeed info - deepspeed install path0.4.2+bc17042, bc17042, big-science................... - ...........0.4.2+bc17042, bc17042, big-sciencedeepspeed wheel compiled w. - ......['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']deepspeed wheel compiled w. -torch 1.8, cuda 11.1......deepspeed info - torch 1.8, cuda 11.1................... - 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... 
[OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_io ............... [NO] .......async_io [NO] -............... [NO] ....... [NO] -transformer_inference .. [NO] transformer_inference....... ..[OKAY] -[NO] ....... [OKAY] -utils .................. [YES] ......utils [OKAY].................. - [YES] ...... [OKAY]quantizer - .............. [NO] quantizer....... ..............[OKAY] -[NO] ....... [OKAY]-------------------------------------------------- - --------------------------------------------------- -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info:DeepSpeed general environment info: -DeepSpeed general environment info: - -torch install pathtorch install path ..............................torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']torch version -torch version ........................................torch version 1.8.11.8.1.................... - - 1.8.1torch cuda versiontorch cuda version - .............................. torch cuda version 11.1 11.1 -............... - nvcc versionnvcc version11.1 -.......................................... nvcc version11.211.2 - -.....................deepspeed install pathdeepspeed install path 11.2...................... - deepspeed install path ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']........... - - deepspeed infodeepspeed info ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']...................................... - deepspeed info0.4.2+bc17042, bc17042, big-science -0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w....................deepspeed wheel compiled w. 0.4.2+bc17042, bc17042, big-science............ - deepspeed wheel compiled w.torch 1.8, cuda 11.1torch 1.8, cuda 11.1 - -...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... 
['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- ---------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - --------------------------------------------------- -DeepSpeed C++/CUDA extension op reportJIT compiled ops requires ninja-------------------------------------------------- - - ---------------------------------------------------DeepSpeed C++/CUDA extension op report - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.-------------------------------------------------- - ---------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - -JIT compiled ops requires ninja---------------------------------------------------------------------------------------------------- - - -JIT compiled ops requires ninja -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -ninjaninjaninjaninja .................................... ..................[OKAY] .................. [OKAY] - [OKAY] -[OKAY] --------------------------------------------------- --------------------------------------------------- - -----------------------------------------------------------------------------------------------------op name -op name - op name................................op name ................installedinstalled................ installed installed.... compatible....compatible - - --------------------------------------------------compatible-------------------------------------------------- - - -compatible-------------------------------------------------- - --------------------------------------------------- -cpu_adamcpu_adam .............................. [YES][YES]cpu_adam ...... ......cpu_adam............... [OKAY]...............[OKAY][YES] - - [YES]...... ......[OKAY] - [OKAY] -fused_adamfused_adam .......................... [NO][NO] .......fused_adam....... [OKAY].............[OKAY] - -fused_adam [NO]fused_lamb.............fused_lamb .......[NO].......................... [OKAY].......[NO][NO] - [OKAY]..............fused_lamb - [OKAY][OKAY]............. - -fused_lamb [NO] .................... 
[NO][OKAY] -....... [OKAY]sparse_attn - sparse_attn............ ............[NO] .......[NO] [OKAY]....... - sparse_attn[OKAY] transformer -............ ............transformersparse_attn[NO] [NO]............................... .......[NO][OKAY] [NO] - .......[OKAY]....... -transformer[OKAY][OKAY] - -............stochastic_transformer transformerstochastic_transformer [NO] ............. . .......[NO][NO] [OKAY][NO].............. - [OKAY].......[OKAY] - -[OKAY] -stochastic_transformer stochastic_transformer. [NO]. .......[NO] [OKAY] -....... [OKAY] -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report-------------------------------------------------- --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.DeepSpeed C++/CUDA extension op report - ----------------------------------------------------------------------------------------------------- --------------------------------------------------- - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.JIT compiled ops requires ninjaNOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. 
Op compatibility means that your system - meet the required dependencies to JIT install the op. - - ----------------------------------------------------------------------------------------------------- - -JIT compiled ops requires ninjaJIT compiled ops requires ninja - -ninjaninjaninjaninja ........................................................................ [OKAY][OKAY][OKAY][OKAY] - - --------------------------------------------------- --------------------------------------------------- --------------------------------------------------- ---------------------------------------------------op name -op name - op name................................op name installedinstalled................................ ..installed.. compatibleinstalledcompatible.. - - --------------------------------------------------..--------------------------------------------------compatible - - -compatible-------------------------------------------------- - --------------------------------------------------- -cpu_adamcpu_adam .............................. cpu_adam[YES][YES] cpu_adam........................... [YES][OKAY]............... -[OKAY] -......[YES] [OKAY]...... - [OKAY] -fused_adam ............. [NO] .......fused_adam [OKAY].............fused_adam - fused_adam [NO] ............. fused_lamb.................... .............[NO][NO][OKAY] - .......[NO]....... fused_lamb[OKAY].......[OKAY] - -.............[OKAY] -fused_lamb[NO] fused_lamb ................................. [NO][NO][OKAY] -..............sparse_attn [OKAY][OKAY]............ - - [NO] ....... [OKAY] -sparse_attntransformer ........................ [NO][NO] sparse_attn .......sparse_attn ....... ............ [OKAY] -............[NO][OKAY] -[NO]transformer....... ...................[OKAY]stochastic_transformer -[NO][OKAY] . - .......transformer[NO] [OKAY]transformer....... -............ ............[OKAY]stochastic_transformer[NO] - [NO]....... . ....... [OKAY] [NO] -[OKAY] -.......stochastic_transformer [OKAY] stochastic_transformer - . .[NO] [NO]....... .......[OKAY] -[OKAY] -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 
0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. 
Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_ioasync_io .............................. [NO][NO] .............. [NO][NO] - -transformer_inferencetransformer_inference .... [NO][NO] .............. [OKAY][OKAY] - -utilsutils .................................... [YES][YES] ............ [OKAY][OKAY] - -quantizerquantizer ............................ [NO][NO] .............. [OKAY][OKAY] - --------------------------------------------------- --------------------------------------------------- -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info:DeepSpeed general environment info:DeepSpeed general environment info: - - -torch install pathtorch install pathtorch install path ............................................. ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - - -torch version torch versiontorch version.................... ........................................1.8.1 -1.8.11.8.1 - -torch cuda version torch cuda version...............torch cuda version 11.1.............................. - nvcc version11.111.1 - -.....................nvcc versionnvcc version 11.2.......................................... - deepspeed install path11.211.2 - -...........deepspeed install pathdeepspeed install path ......................['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']deepspeed info['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed info...................deepspeed info ...................0.4.2+bc17042, bc17042, big-science................... - 0.4.2+bc17042, bc17042, big-sciencedeepspeed wheel compiled w.0.4.2+bc17042, bc17042, big-science - -deepspeed wheel compiled w.......deepspeed wheel compiled w. ......torch 1.8, cuda 11.1...... - torch 1.8, cuda 11.1torch 1.8, cuda 11.1 - -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 
11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... 
torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** ----------------------------------------------------------------------------------------------------- -DeepSpeed C++/CUDA extension op report - -DeepSpeed C++/CUDA extension op report-------------------------------------------------- - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.-------------------------------------------------- - ----------------------------------------------------------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.JIT compiled ops requires ninja - - ---------------------------------------------------DeepSpeed C++/CUDA extension op report-------------------------------------------------- - - ---------------------------------------------------JIT compiled ops requires ninja -DeepSpeed C++/CUDA extension op report -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - ----------------------------------------------------------------------------------------------------- - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.JIT compiled ops requires ninja - --------------------------------------------------- -JIT compiled ops requires ninja -ninjaninjaninjaninja .................................... .................................... [OKAY][OKAY][OKAY] - - -[OKAY]------------------------------------------------------------------------------------------------------------------------------------------------------ - - - -op nameop nameop name-------------------------------------------------- -................................................ op nameinstalledinstalledinstalled .................... .. compatible -compatibleinstalledcompatible-------------------------------------------------- - - -..---------------------------------------------------------------------------------------------------- - - compatible --------------------------------------------------- -cpu_adam ............... [YES] cpu_adam......cpu_adam ...............[OKAY]cpu_adam -............... [YES]............... [YES] ...... [YES] ...... [OKAY] ......fused_adam -[OKAY] -.............[OKAY] -[NO] ....... [OKAY] -fused_adamfused_lamb fused_adam ............. ............. ............. fused_adam[NO] [NO] [NO] .................... ....... ....... [OKAY] [NO] -[OKAY][OKAY] - -....... 
fused_lamb[OKAY]fused_lamb - .......................... [NO][NO] fused_lamb....... .......sparse_attn.............[OKAY] [OKAY] -............[NO] - [NO]....... ....... [OKAY][OKAY] - -transformersparse_attnsparse_attn .................................... [NO][NO][NO] ..................... [OKAY][OKAY][OKAY]sparse_attn - - - ............ transformerstochastic_transformer [NO]transformer ............ .................... [NO] [NO][OKAY]....... [NO] - ....... ....... [OKAY] [OKAY] -transformer[OKAY] - - ............ stochastic_transformer[NO] stochastic_transformer ......... [OKAY][NO][NO] - .............. [OKAY][OKAY] -stochastic_transformer - . [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... 
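Everything above is DeepSpeed's standard startup diagnostic, printed once per launcher process. If it needs to be reproduced outside a training launch, DeepSpeed ships the same reporter as the `ds_report` console script; a minimal sketch, assuming `deepspeed.env_report.main` is the entry point behind that script (true for DeepSpeed releases of this era):

    # Minimal sketch: regenerate the op-compatibility report and the
    # "DeepSpeed general environment info" block without launching training.
    # Assumption: deepspeed.env_report.main backs the `ds_report` CLI.
    from deepspeed.env_report import main as ds_report

    ds_report()
    # The async_io [NO] result above is explained by the [WARNING]: installing
    # the libaio headers (`apt install libaio-dev`, per the log) lets it build.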
-> initializing tensor model parallel with size 4
-> initializing pipeline model parallel with size 8
-> setting random seeds to 42 ...
-[2021-09-27 03:54:33,898] [INFO] [checkpointing.py:226:model_parallel_cuda_manual_seed] > initializing model parallel cuda seeds on global rank 0, model parallel rank 0, and data parallel rank 0 with model parallel seed: 2760 and data parallel seed: 42
-> compiling dataset index builder ...
-make: Entering directory '/gpfsssd/worksf/projects/rech/six/commun/code/tr8-104B/Megatron-DeepSpeed-tr8-104B/megatron/data'
-make: Nothing to be done for 'default'.
-make: Leaving directory '/gpfsssd/worksf/projects/rech/six/commun/code/tr8-104B/Megatron-DeepSpeed-tr8-104B/megatron/data'
->>> done with dataset index builder. Compilation time: 0.303 seconds
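The seed line above is worth decoding: the run sets the base seed to 42, yet rank 0 reports a model-parallel seed of 2760. That matches Megatron's convention of adding a fixed offset of 2718 plus the tensor-parallel rank to the base seed, while data-parallel replicas keep the base seed. A sketch of that derivation; the 2718 offset is taken from Megatron-LM's mpu/random.py and is an assumption about this exact fork:

    # Sketch of Megatron's per-rank CUDA seed derivation (offset 2718 assumed
    # from Megatron-LM's mpu/random.py; corroborated by 42 + 2718 = 2760 above).
    def model_parallel_seeds(base_seed: int, tp_rank: int):
        offset = base_seed + 2718
        tensor_model_parallel_seed = offset + tp_rank  # unique per tensor-parallel rank
        data_parallel_seed = base_seed                 # shared across data-parallel replicas
        return tensor_model_parallel_seed, data_parallel_seed

    print(model_parallel_seeds(42, 0))  # (2760, 42), matching the log line above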
-> compiling and loading fused kernels ...
-/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch/utils/cpp_extension.py:283: UserWarning:
-
-                               !! WARNING !!
-
-!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
-Your compiler (c++) is not compatible with the compiler Pytorch was
-built with for this platform, which is g++ on linux. Please
-use g++ to to compile your extension. Alternatively, you may
-compile PyTorch from source using c++, and then you can also use
-c++ to compile your extension.
-
-See https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md for help
-with compiling PyTorch from source.
-!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
-
-                               !! WARNING !!
-
-  warnings.warn(WRONG_COMPILER_WARNING.format(
-Detected CUDA files, patching ldflags
-Emitting ninja build file /gpfsssd/worksf/projects/rech/six/commun/code/tr8-104B/Megatron-DeepSpeed-tr8-104B/megatron/fused_kernels/build/build.ninja...
-Building extension module scaled_upper_triang_masked_softmax_cuda...
-Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
-ninja: no work to do.
-Loading extension module scaled_upper_triang_masked_softmax_cuda...
-Detected CUDA files, patching ldflags
-Emitting ninja build file /gpfsssd/worksf/projects/rech/six/commun/code/tr8-104B/Megatron-DeepSpeed-tr8-104B/megatron/fused_kernels/build/build.ninja...
-Building extension module scaled_masked_softmax_cuda...
-Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
-ninja: no work to do.
-Loading extension module scaled_masked_softmax_cuda...
-Detected CUDA files, patching ldflags
-Emitting ninja build file /gpfsssd/worksf/projects/rech/six/commun/code/tr8-104B/Megatron-DeepSpeed-tr8-104B/megatron/fused_kernels/build/build.ninja...
-Building extension module fused_mix_prec_layer_norm_cuda...
-Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
-ninja: no work to do.
-Loading extension module fused_mix_prec_layer_norm_cuda...
[... the UserWarning above was re-emitted by every other rank while loading these modules; the verbatim repeats are elided ...]
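The compiler warning repeated above is harmless in this run (ninja finds the kernels already built and reports "no work to do"), but it can usually be silenced by pointing PyTorch's extension builder at g++ explicitly, since torch.utils.cpp_extension falls back to the plain `c++` alias when the CXX environment variable is unset. A hedged sketch; exporting CXX in the job script before launch achieves the same thing:

    import os

    # Point PyTorch's JIT extension builder at g++ before the fused kernels are
    # compiled/loaded; with CXX unset, cpp_extension picks up the `c++` alias,
    # which triggered the WRONG_COMPILER_WARNING above. (Sketch, not from the log.)
    os.environ.setdefault("CXX", "g++")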
->>> done with compiling and loading fused kernels. Compilation time: 20.521 seconds
-time to initialize megatron (seconds): -17.161
-[after megatron is initialized] datetime: 2021-09-27 03:54:54
-building GPT model ...
-[2021-09-27 03:54:54,901] [INFO] [utils.py:680:see_memory_usage] Before Building Model
-/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch/cuda/memory.py:373: FutureWarning: torch.cuda.memory_cached has been renamed to torch.cuda.memory_reserved
-  warnings.warn(
-/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch/cuda/memory.py:381: FutureWarning: torch.cuda.max_memory_cached has been renamed to torch.cuda.max_memory_reserved
-  warnings.warn(
-[2021-09-27 03:54:54,903] [INFO] [utils.py:681:see_memory_usage] MA 0.0 GB Max_MA 0.0 GB CA 0.0 GB Max_CA 0 GB
-[2021-09-27 03:54:54,903] [INFO] [utils.py:689:see_memory_usage] CPU Virtual Memory: used = 36.85 GB, percent = 19.7%
-SEED_LAYERS=False BASE_SEED=1234 SEED_FN=None
-Using topology: {ProcessCoord(pipe=0, data=0, model=0): 0, ProcessCoord(pipe=0, data=0, model=1): 1, ProcessCoord(pipe=0, data=0, model=2): 2, ProcessCoord(pipe=0, data=0, model=3): 3, ProcessCoord(pipe=0, data=1, model=0): 4, ProcessCoord(pipe=0, data=1, model=1): 5, ProcessCoord(pipe=0, data=1, model=2): 6, ProcessCoord(pipe=0, data=1, model=3): 7, ProcessCoord(pipe=0, data=2, model=0): 8, ProcessCoord(pipe=0, data=2, model=1): 9, ProcessCoord(pipe=0, data=2, model=2): 10, ProcessCoord(pipe=0, data=2, model=3): 11, ProcessCoord(pipe=0, data=3, model=0): 12, ProcessCoord(pipe=0, data=3, model=1): 13, ProcessCoord(pipe=0, data=3, model=2): 14, ProcessCoord(pipe=0, data=3, model=3): 15, ProcessCoord(pipe=0, data=4, model=0): 16, ProcessCoord(pipe=0, data=4, model=1): 17, ProcessCoord(pipe=0, data=4, model=2): 18, ProcessCoord(pipe=0, data=4, model=3): 19, ProcessCoord(pipe=0, data=5, model=0): 20, ProcessCoord(pipe=0, data=5, model=1): 21, ProcessCoord(pipe=0, data=5, model=2): 22, ProcessCoord(pipe=0, data=5, model=3): 23, ProcessCoord(pipe=0, data=6, model=0): 24, ProcessCoord(pipe=0, data=6, model=1): 25, ProcessCoord(pipe=0, data=6, model=2): 26, ProcessCoord(pipe=0, data=6, model=3): 27, ProcessCoord(pipe=0, data=7, model=0): 28, ProcessCoord(pipe=0, data=7, model=1): 29, ProcessCoord(pipe=0, data=7, model=2): 30, ProcessCoord(pipe=0, data=7, model=3): 31, ProcessCoord(pipe=0, data=8, model=0): 32, ProcessCoord(pipe=0, data=8, model=1): 33, ProcessCoord(pipe=0, data=8, model=2): 34, 
ProcessCoord(pipe=0, data=8, model=3): 35, ProcessCoord(pipe=0, data=9, model=0): 36, ProcessCoord(pipe=0, data=9, model=1): 37, ProcessCoord(pipe=0, data=9, model=2): 38, ProcessCoord(pipe=0, data=9, model=3): 39, ProcessCoord(pipe=0, data=10, model=0): 40, ProcessCoord(pipe=0, data=10, model=1): 41, ProcessCoord(pipe=0, data=10, model=2): 42, ProcessCoord(pipe=0, data=10, model=3): 43, ProcessCoord(pipe=0, data=11, model=0): 44, ProcessCoord(pipe=0, data=11, model=1): 45, ProcessCoord(pipe=0, data=11, model=2): 46, ProcessCoord(pipe=0, data=11, model=3): 47, ProcessCoord(pipe=0, data=12, model=0): 48, ProcessCoord(pipe=0, data=12, model=1): 49, ProcessCoord(pipe=0, data=12, model=2): 50, ProcessCoord(pipe=0, data=12, model=3): 51, ProcessCoord(pipe=0, data=13, model=0): 52, ProcessCoord(pipe=0, data=13, model=1): 53, ProcessCoord(pipe=0, data=13, model=2): 54, ProcessCoord(pipe=0, data=13, model=3): 55, ProcessCoord(pipe=0, data=14, model=0): 56, ProcessCoord(pipe=0, data=14, model=1): 57, ProcessCoord(pipe=0, data=14, model=2): 58, ProcessCoord(pipe=0, data=14, model=3): 59, ProcessCoord(pipe=0, data=15, model=0): 60, ProcessCoord(pipe=0, data=15, model=1): 61, ProcessCoord(pipe=0, data=15, model=2): 62, ProcessCoord(pipe=0, data=15, model=3): 63, ProcessCoord(pipe=1, data=0, model=0): 64, ProcessCoord(pipe=1, data=0, model=1): 65, ProcessCoord(pipe=1, data=0, model=2): 66, ProcessCoord(pipe=1, data=0, model=3): 67, ProcessCoord(pipe=1, data=1, model=0): 68, ProcessCoord(pipe=1, data=1, model=1): 69, ProcessCoord(pipe=1, data=1, model=2): 70, ProcessCoord(pipe=1, data=1, model=3): 71, ProcessCoord(pipe=1, data=2, model=0): 72, ProcessCoord(pipe=1, data=2, model=1): 73, ProcessCoord(pipe=1, data=2, model=2): 74, ProcessCoord(pipe=1, data=2, model=3): 75, ProcessCoord(pipe=1, data=3, model=0): 76, ProcessCoord(pipe=1, data=3, model=1): 77, ProcessCoord(pipe=1, data=3, model=2): 78, ProcessCoord(pipe=1, data=3, model=3): 79, ProcessCoord(pipe=1, data=4, model=0): 80, ProcessCoord(pipe=1, data=4, model=1): 81, ProcessCoord(pipe=1, data=4, model=2): 82, ProcessCoord(pipe=1, data=4, model=3): 83, ProcessCoord(pipe=1, data=5, model=0): 84, ProcessCoord(pipe=1, data=5, model=1): 85, ProcessCoord(pipe=1, data=5, model=2): 86, ProcessCoord(pipe=1, data=5, model=3): 87, ProcessCoord(pipe=1, data=6, model=0): 88, ProcessCoord(pipe=1, data=6, model=1): 89, ProcessCoord(pipe=1, data=6, model=2): 90, ProcessCoord(pipe=1, data=6, model=3): 91, ProcessCoord(pipe=1, data=7, model=0): 92, ProcessCoord(pipe=1, data=7, model=1): 93, ProcessCoord(pipe=1, data=7, model=2): 94, ProcessCoord(pipe=1, data=7, model=3): 95, ProcessCoord(pipe=1, data=8, model=0): 96, ProcessCoord(pipe=1, data=8, model=1): 97, ProcessCoord(pipe=1, data=8, model=2): 98, ProcessCoord(pipe=1, data=8, model=3): 99, ProcessCoord(pipe=1, data=9, model=0): 100, ProcessCoord(pipe=1, data=9, model=1): 101, ProcessCoord(pipe=1, data=9, model=2): 102, ProcessCoord(pipe=1, data=9, model=3): 103, ProcessCoord(pipe=1, data=10, model=0): 104, ProcessCoord(pipe=1, data=10, model=1): 105, ProcessCoord(pipe=1, data=10, model=2): 106, ProcessCoord(pipe=1, data=10, model=3): 107, ProcessCoord(pipe=1, data=11, model=0): 108, ProcessCoord(pipe=1, data=11, model=1): 109, ProcessCoord(pipe=1, data=11, model=2): 110, ProcessCoord(pipe=1, data=11, model=3): 111, ProcessCoord(pipe=1, data=12, model=0): 112, ProcessCoord(pipe=1, data=12, model=1): 113, ProcessCoord(pipe=1, data=12, model=2): 114, ProcessCoord(pipe=1, data=12, model=3): 115, 
ProcessCoord(pipe=1, data=13, model=0): 116, ProcessCoord(pipe=1, data=13, model=1): 117, ProcessCoord(pipe=1, data=13, model=2): 118, ProcessCoord(pipe=1, data=13, model=3): 119, ProcessCoord(pipe=1, data=14, model=0): 120, ProcessCoord(pipe=1, data=14, model=1): 121, ProcessCoord(pipe=1, data=14, model=2): 122, ProcessCoord(pipe=1, data=14, model=3): 123, ProcessCoord(pipe=1, data=15, model=0): 124, ProcessCoord(pipe=1, data=15, model=1): 125, ProcessCoord(pipe=1, data=15, model=2): 126, ProcessCoord(pipe=1, data=15, model=3): 127, ProcessCoord(pipe=2, data=0, model=0): 128, ProcessCoord(pipe=2, data=0, model=1): 129, ProcessCoord(pipe=2, data=0, model=2): 130, ProcessCoord(pipe=2, data=0, model=3): 131, ProcessCoord(pipe=2, data=1, model=0): 132, ProcessCoord(pipe=2, data=1, model=1): 133, ProcessCoord(pipe=2, data=1, model=2): 134, ProcessCoord(pipe=2, data=1, model=3): 135, ProcessCoord(pipe=2, data=2, model=0): 136, ProcessCoord(pipe=2, data=2, model=1): 137, ProcessCoord(pipe=2, data=2, model=2): 138, ProcessCoord(pipe=2, data=2, model=3): 139, ProcessCoord(pipe=2, data=3, model=0): 140, ProcessCoord(pipe=2, data=3, model=1): 141, ProcessCoord(pipe=2, data=3, model=2): 142, ProcessCoord(pipe=2, data=3, model=3): 143, ProcessCoord(pipe=2, data=4, model=0): 144, ProcessCoord(pipe=2, data=4, model=1): 145, ProcessCoord(pipe=2, data=4, model=2): 146, ProcessCoord(pipe=2, data=4, model=3): 147, ProcessCoord(pipe=2, data=5, model=0): 148, ProcessCoord(pipe=2, data=5, model=1): 149, ProcessCoord(pipe=2, data=5, model=2): 150, ProcessCoord(pipe=2, data=5, model=3): 151, ProcessCoord(pipe=2, data=6, model=0): 152, ProcessCoord(pipe=2, data=6, model=1): 153, ProcessCoord(pipe=2, data=6, model=2): 154, ProcessCoord(pipe=2, data=6, model=3): 155, ProcessCoord(pipe=2, data=7, model=0): 156, ProcessCoord(pipe=2, data=7, model=1): 157, ProcessCoord(pipe=2, data=7, model=2): 158, ProcessCoord(pipe=2, data=7, model=3): 159, ProcessCoord(pipe=2, data=8, model=0): 160, ProcessCoord(pipe=2, data=8, model=1): 161, ProcessCoord(pipe=2, data=8, model=2): 162, ProcessCoord(pipe=2, data=8, model=3): 163, ProcessCoord(pipe=2, data=9, model=0): 164, ProcessCoord(pipe=2, data=9, model=1): 165, ProcessCoord(pipe=2, data=9, model=2): 166, ProcessCoord(pipe=2, data=9, model=3): 167, ProcessCoord(pipe=2, data=10, model=0): 168, ProcessCoord(pipe=2, data=10, model=1): 169, ProcessCoord(pipe=2, data=10, model=2): 170, ProcessCoord(pipe=2, data=10, model=3): 171, ProcessCoord(pipe=2, data=11, model=0): 172, ProcessCoord(pipe=2, data=11, model=1): 173, ProcessCoord(pipe=2, data=11, model=2): 174, ProcessCoord(pipe=2, data=11, model=3): 175, ProcessCoord(pipe=2, data=12, model=0): 176, ProcessCoord(pipe=2, data=12, model=1): 177, ProcessCoord(pipe=2, data=12, model=2): 178, ProcessCoord(pipe=2, data=12, model=3): 179, ProcessCoord(pipe=2, data=13, model=0): 180, ProcessCoord(pipe=2, data=13, model=1): 181, ProcessCoord(pipe=2, data=13, model=2): 182, ProcessCoord(pipe=2, data=13, model=3): 183, ProcessCoord(pipe=2, data=14, model=0): 184, ProcessCoord(pipe=2, data=14, model=1): 185, ProcessCoord(pipe=2, data=14, model=2): 186, ProcessCoord(pipe=2, data=14, model=3): 187, ProcessCoord(pipe=2, data=15, model=0): 188, ProcessCoord(pipe=2, data=15, model=1): 189, ProcessCoord(pipe=2, data=15, model=2): 190, ProcessCoord(pipe=2, data=15, model=3): 191, ProcessCoord(pipe=3, data=0, model=0): 192, ProcessCoord(pipe=3, data=0, model=1): 193, ProcessCoord(pipe=3, data=0, model=2): 194, ProcessCoord(pipe=3, data=0, model=3): 
195, ProcessCoord(pipe=3, data=1, model=0): 196, ProcessCoord(pipe=3, data=1, model=1): 197, ProcessCoord(pipe=3, data=1, model=2): 198, ProcessCoord(pipe=3, data=1, model=3): 199, ProcessCoord(pipe=3, data=2, model=0): 200, ProcessCoord(pipe=3, data=2, model=1): 201, ProcessCoord(pipe=3, data=2, model=2): 202, ProcessCoord(pipe=3, data=2, model=3): 203, ProcessCoord(pipe=3, data=3, model=0): 204, ProcessCoord(pipe=3, data=3, model=1): 205, ProcessCoord(pipe=3, data=3, model=2): 206, ProcessCoord(pipe=3, data=3, model=3): 207, ProcessCoord(pipe=3, data=4, model=0): 208, ProcessCoord(pipe=3, data=4, model=1): 209, ProcessCoord(pipe=3, data=4, model=2): 210, ProcessCoord(pipe=3, data=4, model=3): 211, ProcessCoord(pipe=3, data=5, model=0): 212, ProcessCoord(pipe=3, data=5, model=1): 213, ProcessCoord(pipe=3, data=5, model=2): 214, ProcessCoord(pipe=3, data=5, model=3): 215, ProcessCoord(pipe=3, data=6, model=0): 216, ProcessCoord(pipe=3, data=6, model=1): 217, ProcessCoord(pipe=3, data=6, model=2): 218, ProcessCoord(pipe=3, data=6, model=3): 219, ProcessCoord(pipe=3, data=7, model=0): 220, ProcessCoord(pipe=3, data=7, model=1): 221, ProcessCoord(pipe=3, data=7, model=2): 222, ProcessCoord(pipe=3, data=7, model=3): 223, ProcessCoord(pipe=3, data=8, model=0): 224, ProcessCoord(pipe=3, data=8, model=1): 225, ProcessCoord(pipe=3, data=8, model=2): 226, ProcessCoord(pipe=3, data=8, model=3): 227, ProcessCoord(pipe=3, data=9, model=0): 228, ProcessCoord(pipe=3, data=9, model=1): 229, ProcessCoord(pipe=3, data=9, model=2): 230, ProcessCoord(pipe=3, data=9, model=3): 231, ProcessCoord(pipe=3, data=10, model=0): 232, ProcessCoord(pipe=3, data=10, model=1): 233, ProcessCoord(pipe=3, data=10, model=2): 234, ProcessCoord(pipe=3, data=10, model=3): 235, ProcessCoord(pipe=3, data=11, model=0): 236, ProcessCoord(pipe=3, data=11, model=1): 237, ProcessCoord(pipe=3, data=11, model=2): 238, ProcessCoord(pipe=3, data=11, model=3): 239, ProcessCoord(pipe=3, data=12, model=0): 240, ProcessCoord(pipe=3, data=12, model=1): 241, ProcessCoord(pipe=3, data=12, model=2): 242, ProcessCoord(pipe=3, data=12, model=3): 243, ProcessCoord(pipe=3, data=13, model=0): 244, ProcessCoord(pipe=3, data=13, model=1): 245, ProcessCoord(pipe=3, data=13, model=2): 246, ProcessCoord(pipe=3, data=13, model=3): 247, ProcessCoord(pipe=3, data=14, model=0): 248, ProcessCoord(pipe=3, data=14, model=1): 249, ProcessCoord(pipe=3, data=14, model=2): 250, ProcessCoord(pipe=3, data=14, model=3): 251, ProcessCoord(pipe=3, data=15, model=0): 252, ProcessCoord(pipe=3, data=15, model=1): 253, ProcessCoord(pipe=3, data=15, model=2): 254, ProcessCoord(pipe=3, data=15, model=3): 255, ProcessCoord(pipe=4, data=0, model=0): 256, ProcessCoord(pipe=4, data=0, model=1): 257, ProcessCoord(pipe=4, data=0, model=2): 258, ProcessCoord(pipe=4, data=0, model=3): 259, ProcessCoord(pipe=4, data=1, model=0): 260, ProcessCoord(pipe=4, data=1, model=1): 261, ProcessCoord(pipe=4, data=1, model=2): 262, ProcessCoord(pipe=4, data=1, model=3): 263, ProcessCoord(pipe=4, data=2, model=0): 264, ProcessCoord(pipe=4, data=2, model=1): 265, ProcessCoord(pipe=4, data=2, model=2): 266, ProcessCoord(pipe=4, data=2, model=3): 267, ProcessCoord(pipe=4, data=3, model=0): 268, ProcessCoord(pipe=4, data=3, model=1): 269, ProcessCoord(pipe=4, data=3, model=2): 270, ProcessCoord(pipe=4, data=3, model=3): 271, ProcessCoord(pipe=4, data=4, model=0): 272, ProcessCoord(pipe=4, data=4, model=1): 273, ProcessCoord(pipe=4, data=4, model=2): 274, ProcessCoord(pipe=4, data=4, model=3): 275, 
ProcessCoord(pipe=4, data=5, model=0): 276, ProcessCoord(pipe=4, data=5, model=1): 277, ProcessCoord(pipe=4, data=5, model=2): 278, ProcessCoord(pipe=4, data=5, model=3): 279, ProcessCoord(pipe=4, data=6, model=0): 280, ProcessCoord(pipe=4, data=6, model=1): 281, ProcessCoord(pipe=4, data=6, model=2): 282, ProcessCoord(pipe=4, data=6, model=3): 283, ProcessCoord(pipe=4, data=7, model=0): 284, ProcessCoord(pipe=4, data=7, model=1): 285, ProcessCoord(pipe=4, data=7, model=2): 286, ProcessCoord(pipe=4, data=7, model=3): 287, ProcessCoord(pipe=4, data=8, model=0): 288, ProcessCoord(pipe=4, data=8, model=1): 289, ProcessCoord(pipe=4, data=8, model=2): 290, ProcessCoord(pipe=4, data=8, model=3): 291, ProcessCoord(pipe=4, data=9, model=0): 292, ProcessCoord(pipe=4, data=9, model=1): 293, ProcessCoord(pipe=4, data=9, model=2): 294, ProcessCoord(pipe=4, data=9, model=3): 295, ProcessCoord(pipe=4, data=10, model=0): 296, ProcessCoord(pipe=4, data=10, model=1): 297, ProcessCoord(pipe=4, data=10, model=2): 298, ProcessCoord(pipe=4, data=10, model=3): 299, ProcessCoord(pipe=4, data=11, model=0): 300, ProcessCoord(pipe=4, data=11, model=1): 301, ProcessCoord(pipe=4, data=11, model=2): 302, ProcessCoord(pipe=4, data=11, model=3): 303, ProcessCoord(pipe=4, data=12, model=0): 304, ProcessCoord(pipe=4, data=12, model=1): 305, ProcessCoord(pipe=4, data=12, model=2): 306, ProcessCoord(pipe=4, data=12, model=3): 307, ProcessCoord(pipe=4, data=13, model=0): 308, ProcessCoord(pipe=4, data=13, model=1): 309, ProcessCoord(pipe=4, data=13, model=2): 310, ProcessCoord(pipe=4, data=13, model=3): 311, ProcessCoord(pipe=4, data=14, model=0): 312, ProcessCoord(pipe=4, data=14, model=1): 313, ProcessCoord(pipe=4, data=14, model=2): 314, ProcessCoord(pipe=4, data=14, model=3): 315, ProcessCoord(pipe=4, data=15, model=0): 316, ProcessCoord(pipe=4, data=15, model=1): 317, ProcessCoord(pipe=4, data=15, model=2): 318, ProcessCoord(pipe=4, data=15, model=3): 319, ProcessCoord(pipe=5, data=0, model=0): 320, ProcessCoord(pipe=5, data=0, model=1): 321, ProcessCoord(pipe=5, data=0, model=2): 322, ProcessCoord(pipe=5, data=0, model=3): 323, ProcessCoord(pipe=5, data=1, model=0): 324, ProcessCoord(pipe=5, data=1, model=1): 325, ProcessCoord(pipe=5, data=1, model=2): 326, ProcessCoord(pipe=5, data=1, model=3): 327, ProcessCoord(pipe=5, data=2, model=0): 328, ProcessCoord(pipe=5, data=2, model=1): 329, ProcessCoord(pipe=5, data=2, model=2): 330, ProcessCoord(pipe=5, data=2, model=3): 331, ProcessCoord(pipe=5, data=3, model=0): 332, ProcessCoord(pipe=5, data=3, model=1): 333, ProcessCoord(pipe=5, data=3, model=2): 334, ProcessCoord(pipe=5, data=3, model=3): 335, ProcessCoord(pipe=5, data=4, model=0): 336, ProcessCoord(pipe=5, data=4, model=1): 337, ProcessCoord(pipe=5, data=4, model=2): 338, ProcessCoord(pipe=5, data=4, model=3): 339, ProcessCoord(pipe=5, data=5, model=0): 340, ProcessCoord(pipe=5, data=5, model=1): 341, ProcessCoord(pipe=5, data=5, model=2): 342, ProcessCoord(pipe=5, data=5, model=3): 343, ProcessCoord(pipe=5, data=6, model=0): 344, ProcessCoord(pipe=5, data=6, model=1): 345, ProcessCoord(pipe=5, data=6, model=2): 346, ProcessCoord(pipe=5, data=6, model=3): 347, ProcessCoord(pipe=5, data=7, model=0): 348, ProcessCoord(pipe=5, data=7, model=1): 349, ProcessCoord(pipe=5, data=7, model=2): 350, ProcessCoord(pipe=5, data=7, model=3): 351, ProcessCoord(pipe=5, data=8, model=0): 352, ProcessCoord(pipe=5, data=8, model=1): 353, ProcessCoord(pipe=5, data=8, model=2): 354, ProcessCoord(pipe=5, data=8, model=3): 355, 
ProcessCoord(pipe=5, data=9, model=0): 356, ProcessCoord(pipe=5, data=9, model=1): 357, ProcessCoord(pipe=5, data=9, model=2): 358, ProcessCoord(pipe=5, data=9, model=3): 359, ProcessCoord(pipe=5, data=10, model=0): 360, ProcessCoord(pipe=5, data=10, model=1): 361, ProcessCoord(pipe=5, data=10, model=2): 362, ProcessCoord(pipe=5, data=10, model=3): 363, ProcessCoord(pipe=5, data=11, model=0): 364, ProcessCoord(pipe=5, data=11, model=1): 365, ProcessCoord(pipe=5, data=11, model=2): 366, ProcessCoord(pipe=5, data=11, model=3): 367, ProcessCoord(pipe=5, data=12, model=0): 368, ProcessCoord(pipe=5, data=12, model=1): 369, ProcessCoord(pipe=5, data=12, model=2): 370, ProcessCoord(pipe=5, data=12, model=3): 371, ProcessCoord(pipe=5, data=13, model=0): 372, ProcessCoord(pipe=5, data=13, model=1): 373, ProcessCoord(pipe=5, data=13, model=2): 374, ProcessCoord(pipe=5, data=13, model=3): 375, ProcessCoord(pipe=5, data=14, model=0): 376, ProcessCoord(pipe=5, data=14, model=1): 377, ProcessCoord(pipe=5, data=14, model=2): 378, ProcessCoord(pipe=5, data=14, model=3): 379, ProcessCoord(pipe=5, data=15, model=0): 380, ProcessCoord(pipe=5, data=15, model=1): 381, ProcessCoord(pipe=5, data=15, model=2): 382, ProcessCoord(pipe=5, data=15, model=3): 383, ProcessCoord(pipe=6, data=0, model=0): 384, ProcessCoord(pipe=6, data=0, model=1): 385, ProcessCoord(pipe=6, data=0, model=2): 386, ProcessCoord(pipe=6, data=0, model=3): 387, ProcessCoord(pipe=6, data=1, model=0): 388, ProcessCoord(pipe=6, data=1, model=1): 389, ProcessCoord(pipe=6, data=1, model=2): 390, ProcessCoord(pipe=6, data=1, model=3): 391, ProcessCoord(pipe=6, data=2, model=0): 392, ProcessCoord(pipe=6, data=2, model=1): 393, ProcessCoord(pipe=6, data=2, model=2): 394, ProcessCoord(pipe=6, data=2, model=3): 395, ProcessCoord(pipe=6, data=3, model=0): 396, ProcessCoord(pipe=6, data=3, model=1): 397, ProcessCoord(pipe=6, data=3, model=2): 398, ProcessCoord(pipe=6, data=3, model=3): 399, ProcessCoord(pipe=6, data=4, model=0): 400, ProcessCoord(pipe=6, data=4, model=1): 401, ProcessCoord(pipe=6, data=4, model=2): 402, ProcessCoord(pipe=6, data=4, model=3): 403, ProcessCoord(pipe=6, data=5, model=0): 404, ProcessCoord(pipe=6, data=5, model=1): 405, ProcessCoord(pipe=6, data=5, model=2): 406, ProcessCoord(pipe=6, data=5, model=3): 407, ProcessCoord(pipe=6, data=6, model=0): 408, ProcessCoord(pipe=6, data=6, model=1): 409, ProcessCoord(pipe=6, data=6, model=2): 410, ProcessCoord(pipe=6, data=6, model=3): 411, ProcessCoord(pipe=6, data=7, model=0): 412, ProcessCoord(pipe=6, data=7, model=1): 413, ProcessCoord(pipe=6, data=7, model=2): 414, ProcessCoord(pipe=6, data=7, model=3): 415, ProcessCoord(pipe=6, data=8, model=0): 416, ProcessCoord(pipe=6, data=8, model=1): 417, ProcessCoord(pipe=6, data=8, model=2): 418, ProcessCoord(pipe=6, data=8, model=3): 419, ProcessCoord(pipe=6, data=9, model=0): 420, ProcessCoord(pipe=6, data=9, model=1): 421, ProcessCoord(pipe=6, data=9, model=2): 422, ProcessCoord(pipe=6, data=9, model=3): 423, ProcessCoord(pipe=6, data=10, model=0): 424, ProcessCoord(pipe=6, data=10, model=1): 425, ProcessCoord(pipe=6, data=10, model=2): 426, ProcessCoord(pipe=6, data=10, model=3): 427, ProcessCoord(pipe=6, data=11, model=0): 428, ProcessCoord(pipe=6, data=11, model=1): 429, ProcessCoord(pipe=6, data=11, model=2): 430, ProcessCoord(pipe=6, data=11, model=3): 431, ProcessCoord(pipe=6, data=12, model=0): 432, ProcessCoord(pipe=6, data=12, model=1): 433, ProcessCoord(pipe=6, data=12, model=2): 434, ProcessCoord(pipe=6, data=12, model=3): 
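The topology map above is fully regular: 8 pipeline stages x 16 data-parallel replicas x 4 tensor-parallel shards = 512 ranks, with the model index varying fastest. A minimal sketch reproducing the mapping (illustrative only; `coord_to_rank` is a hypothetical helper, not the Megatron-DeepSpeed source):

    # Rank layout visible in the log: model varies fastest, then data, then pipe.
    PIPE, DATA, MODEL = 8, 16, 4

    def coord_to_rank(pipe: int, data: int, model: int) -> int:
        return (pipe * DATA + data) * MODEL + model

    assert coord_to_rank(0, 0, 3) == 3      # ProcessCoord(pipe=0, data=0, model=3): 3
    assert coord_to_rank(1, 0, 0) == 64     # ProcessCoord(pipe=1, data=0, model=0): 64
    assert coord_to_rank(7, 15, 3) == 511   # last entry in the map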
-[2021-09-27 03:54:57,678] [INFO] [module.py:360:_partition_layers] Partitioning pipeline stages with method type:transformer
-stage=0 layers=7
-     0: _to_float16
-     1: EmbeddingPipe
-     2: <lambda>
-     3: ParallelTransformerLayerPipe
-     4: ParallelTransformerLayerPipe
-     5: ParallelTransformerLayerPipe
-     6: ParallelTransformerLayerPipe
-stage=1 layers=4
-     7: ParallelTransformerLayerPipe
-     8: ParallelTransformerLayerPipe
-     9: ParallelTransformerLayerPipe
-    10: ParallelTransformerLayerPipe
-stage=2 layers=4
-    11: ParallelTransformerLayerPipe
-    12: ParallelTransformerLayerPipe
-    13: ParallelTransformerLayerPipe
-    14: ParallelTransformerLayerPipe
-stage=3 layers=4
-    15: ParallelTransformerLayerPipe
-    16: ParallelTransformerLayerPipe
-    17: ParallelTransformerLayerPipe
-    18: ParallelTransformerLayerPipe
-stage=4 layers=4
-    19: ParallelTransformerLayerPipe
-    20: ParallelTransformerLayerPipe
-    21: ParallelTransformerLayerPipe
-    22: ParallelTransformerLayerPipe
-stage=5 layers=4
-    23: ParallelTransformerLayerPipe
-    24: ParallelTransformerLayerPipe
-    25: ParallelTransformerLayerPipe
-    26: ParallelTransformerLayerPipe
-stage=6 layers=4
-    27: ParallelTransformerLayerPipe
-    28: ParallelTransformerLayerPipe
-    29: ParallelTransformerLayerPipe
-    30: ParallelTransformerLayerPipe
-stage=7 layers=8
-    31: ParallelTransformerLayerPipe
-    32: ParallelTransformerLayerPipe
-    33: ParallelTransformerLayerPipe
-    34: ParallelTransformerLayerPipe
-    35: <lambda>
-    36: MixedFusedLayerNorm
-    37: EmbeddingPipe
-    38: float16_to_fp32
-  loss: CrossEntropy
- > number of parameters on (tensor, pipeline) model parallel rank (2, 2): 1745293312
[... the same count, 1745293312, is reported for every tensor-parallel rank on pipeline stages 1-6; stage 0 ranks report 1986465792 and stage 7 ranks report 1986498560 ...]
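The `type:transformer` partition above balances the 32 ParallelTransformerLayerPipe layers evenly, 4 per stage, with the embedding/cast layers attached to the first stage and the final norm, tied embedding, and cast-back layers attached to the last one, which is why stages 0 and 7 carry more parameters. A rough sketch of that balancing rule (an illustration of the idea under stated assumptions, not DeepSpeed's actual partitioning code; the function name is hypothetical):

    # Illustrative only: split N transformer layers as evenly as possible over S stages.
    def partition_transformer_layers(num_layers: int = 32, num_stages: int = 8):
        per_stage, remainder = divmod(num_layers, num_stages)
        bounds, start = [], 0
        for stage in range(num_stages):
            count = per_stage + (1 if stage < remainder else 0)
            bounds.append((start, start + count))
            start += count
        return bounds

    # 32 layers / 8 stages -> 4 transformer layers per stage, as in the log.
    print(partition_transformer_layers())  # [(0, 4), (4, 8), ..., (28, 32)]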
-[2021-09-27 03:54:59,504] [INFO] [utils.py:680:see_memory_usage] After Building Model
-[2021-09-27 03:54:59,505] [INFO] [utils.py:681:see_memory_usage] MA 3.77 GB Max_MA 3.79 GB CA 3.79 GB Max_CA 4 GB
-[2021-09-27 03:54:59,505] [INFO] [utils.py:689:see_memory_usage] CPU Virtual Memory: used = 37.03 GB, percent = 19.8%
- > number of parameters on (tensor, pipeline) model parallel rank (0, 0): 1986465792
-setting training iterations to 159576
-> learning rate decay style: cosine
-DeepSpeed is enabled.
-[2021-09-27 03:54:59,644] [INFO] [logging.py:68:log_dist] [Rank 0] DeepSpeed info: version=0.4.2+bc17042, git-hash=bc17042, git-branch=big-science
-[2021-09-27 03:54:59,753] [INFO] [engine.py:179:__init__] DeepSpeed Flops Profiler Enabled: False
-[2021-09-27 03:54:59,753] [INFO] [engine.py:736:_configure_optimizer] Removing param_group that has no 'params' in the client Optimizer
-[2021-09-27 03:54:59,753] [INFO] [engine.py:741:_configure_optimizer] Using client Optimizer as basic optimizer
-[2021-09-27 03:54:59,753] [INFO] [engine.py:750:_configure_optimizer] DeepSpeed Basic Optimizer = FusedAdam
-[2021-09-27 03:54:59,753] [INFO] [utils.py:43:is_zero_supported_optimizer] Checking ZeRO support for optimizer=FusedAdam type=
-[2021-09-27 03:54:59,753] [INFO] [logging.py:68:log_dist] [Rank 0] Creating fp16 ZeRO stage 1 optimizer
-[2021-09-27 03:54:59,753] [INFO] [stage2.py:106:__init__] Reduce bucket size 500000000
-[2021-09-27 03:54:59,753] [INFO] [stage2.py:107:__init__] Allgather bucket size 500000000
-[2021-09-27 03:54:59,753] [INFO] [stage2.py:108:__init__] CPU Offload: False
-[2021-09-27 03:54:59,753] [INFO] [stage2.py:109:__init__] Round robin gradient partitioning: False
-[2021-09-27 03:55:04,471] [INFO] [stage2.py:419:__init__] optimizer state initialized
-[2021-09-27 03:55:04,471] [INFO] [logging.py:68:log_dist] [Rank 0] DeepSpeed Final Optimizer = FusedAdam
-[2021-09-27 03:55:04,471] [INFO] [engine.py:553:_configure_lr_scheduler] DeepSpeed using client LR scheduler
-[2021-09-27 03:55:04,471] [INFO] [logging.py:68:log_dist] [Rank 0] DeepSpeed LR Scheduler =
-[2021-09-27 03:55:04,471] [INFO] [logging.py:68:log_dist] [Rank 0] step=0, skipped=0, lr=[0.0, 0.0], mom=[(0.9, 0.999), (0.9, 0.999)]
-[2021-09-27 03:55:04,471] [INFO] [config.py:900:print] DeepSpeedEngine configuration:
-[2021-09-27 03:55:04,472] [INFO] [config.py:904:print] activation_checkpointing_config {
-    "partition_activations": false,
-    "contiguous_memory_optimization": false,
-    "cpu_checkpointing": false,
-    "number_checkpoints": null,
-    "synchronize_checkpoint_boundary": false,
-    "profile": false
-}
-[2021-09-27 03:55:04,472] [INFO] [config.py:904:print] aio_config ................... {'block_size': 1048576, 'queue_depth': 8, 'thread_count': 1, 'single_submit': False, 'overlap_events': True}
-[2021-09-27 03:55:04,472] [INFO] [config.py:904:print] allreduce_always_fp32 ........ False
-[2021-09-27 03:55:04,472] [INFO] [config.py:904:print] amp_enabled .................. False
-[2021-09-27 03:55:04,472] [INFO] [config.py:904:print] amp_params ................... False
-[2021-09-27 03:55:04,472] [INFO] [config.py:904:print] checkpoint_tag_validation_enabled True
-[2021-09-27 03:55:04,472] [INFO] [config.py:904:print] checkpoint_tag_validation_fail False
-[2021-09-27 03:55:04,472] [INFO] [config.py:904:print] disable_allgather ............ False
-[2021-09-27 03:55:04,472] [INFO] [config.py:904:print] dump_state ................... False
-[2021-09-27 03:55:04,472] [INFO] [config.py:904:print] dynamic_loss_scale_args ...... {'init_scale': 4096, 'scale_window': 500, 'delayed_shift': 2, 'min_scale': 1}
-[2021-09-27 03:55:04,472] [INFO] [config.py:904:print] eigenvalue_enabled ........... False
-[2021-09-27 03:55:04,472] [INFO] [config.py:904:print] eigenvalue_gas_boundary_resolution 1
-[2021-09-27 03:55:04,472] [INFO] [config.py:904:print] eigenvalue_layer_name ........ bert.encoder.layer
-[2021-09-27 03:55:04,472] [INFO] [config.py:904:print] eigenvalue_layer_num ......... 0
-[2021-09-27 03:55:04,472] [INFO] [config.py:904:print] eigenvalue_max_iter .......... 100
-[2021-09-27 03:55:04,472] [INFO] [config.py:904:print] eigenvalue_stability ......... 1e-06
-[2021-09-27 03:55:04,472] [INFO] [config.py:904:print] eigenvalue_tol ............... 0.01
-[2021-09-27 03:55:04,472] [INFO] [config.py:904:print] eigenvalue_verbose ........... False
-[2021-09-27 03:55:04,472] [INFO] [config.py:904:print] elasticity_enabled ........... False
-[2021-09-27 03:55:04,472] [INFO] [config.py:904:print] flops_profiler_config ........ {
-    "enabled": false,
-    "profile_step": 1,
-    "module_depth": -1,
-    "top_modules": 1,
-    "detailed": true,
-    "output_file": null
-}
-[2021-09-27 03:55:04,472] [INFO] [config.py:904:print] fp16_enabled ................. True
-[2021-09-27 03:55:04,472] [INFO] [config.py:904:print] fp16_mixed_quantize .......... False
-[2021-09-27 03:55:04,472] [INFO] [config.py:904:print] global_rank .................. 0
-[2021-09-27 03:55:04,472] [INFO] [config.py:904:print] gradient_accumulation_steps .. 128
-[2021-09-27 03:55:04,472] [INFO] [config.py:904:print] gradient_clipping ............ 1.0
-[2021-09-27 03:55:04,472] [INFO] [config.py:904:print] gradient_predivide_factor .... 1.0
-[2021-09-27 03:55:04,472] [INFO] [config.py:904:print] initial_dynamic_scale ........ 4096
-[2021-09-27 03:55:04,472] [INFO] [config.py:904:print] loss_scale ................... 0
-[2021-09-27 03:55:04,472] [INFO] [config.py:904:print] memory_breakdown ............. False
-[2021-09-27 03:55:04,472] [INFO] [config.py:904:print] optimizer_legacy_fusion ...... False
-[2021-09-27 03:55:04,472] [INFO] [config.py:904:print] optimizer_name ............... None
-[2021-09-27 03:55:04,472] [INFO] [config.py:904:print] optimizer_params ............. None
-[2021-09-27 03:55:04,473] [INFO] [config.py:904:print] pipeline ..................... {'stages': 'auto', 'partition': 'best', 'seed_layers': False, 'activation_checkpoint_interval': 0}
-[2021-09-27 03:55:04,473] [INFO] [config.py:904:print] pld_enabled .................. False
-[2021-09-27 03:55:04,473] [INFO] [config.py:904:print] pld_params ................... False
-[2021-09-27 03:55:04,473] [INFO] [config.py:904:print] prescale_gradients ........... False
-[2021-09-27 03:55:04,473] [INFO] [config.py:904:print] quantize_change_rate ......... 0.001
-[2021-09-27 03:55:04,473] [INFO] [config.py:904:print] quantize_groups .............. 1
-[2021-09-27 03:55:04,473] [INFO] [config.py:904:print] quantize_offset .............. 1000
-[2021-09-27 03:55:04,473] [INFO] [config.py:904:print] quantize_period .............. 1000
-[2021-09-27 03:55:04,473] [INFO] [config.py:904:print] quantize_rounding ............ 0
-[2021-09-27 03:55:04,473] [INFO] [config.py:904:print] quantize_start_bits .......... 16
-[2021-09-27 03:55:04,473] [INFO] [config.py:904:print] quantize_target_bits ......... 8
-[2021-09-27 03:55:04,473] [INFO] [config.py:904:print] quantize_training_enabled .... False
-[2021-09-27 03:55:04,473] [INFO] [config.py:904:print] quantize_type ................ 0
-[2021-09-27 03:55:04,473] [INFO] [config.py:904:print] quantize_verbose ............. False
-[2021-09-27 03:55:04,473] [INFO] [config.py:904:print] scheduler_name ............... None
-[2021-09-27 03:55:04,473] [INFO] [config.py:904:print] scheduler_params ............. None
-[2021-09-27 03:55:04,473] [INFO] [config.py:904:print] sparse_attention ............. None
-[2021-09-27 03:55:04,473] [INFO] [config.py:904:print] sparse_gradients_enabled ..... False
-[2021-09-27 03:55:04,473] [INFO] [config.py:904:print] steps_per_print .............. 2000
-[2021-09-27 03:55:04,473] [INFO] [config.py:904:print] tensorboard_enabled .......... False
-[2021-09-27 03:55:04,473] [INFO] [config.py:904:print] tensorboard_job_name ......... DeepSpeedJobName
-[2021-09-27 03:55:04,473] [INFO] [config.py:904:print] tensorboard_output_path ......
-[2021-09-27 03:55:04,473] [INFO] [config.py:904:print] train_batch_size ............. 2048
-[2021-09-27 03:55:04,473] [INFO] [config.py:904:print] train_micro_batch_size_per_gpu 1
-[2021-09-27 03:55:04,473] [INFO] [config.py:904:print] use_quantizer_kernel ......... False
-[2021-09-27 03:55:04,473] [INFO] [config.py:904:print] wall_clock_breakdown ......... False
-[2021-09-27 03:55:04,473] [INFO] [config.py:904:print] world_size ................... 16
-[2021-09-27 03:55:04,473] [INFO] [config.py:904:print] zero_allow_untested_optimizer False
-[2021-09-27 03:55:04,473] [INFO] [config.py:904:print] zero_config .................. {
-    "stage": 1,
-    "contiguous_gradients": false,
-    "reduce_scatter": true,
-    "reduce_bucket_size": 5.000000e+08,
-    "allgather_partitions": true,
-    "allgather_bucket_size": 5.000000e+08,
-    "overlap_comm": false,
-    "load_from_fp32_weights": true,
-    "elastic_checkpoint": true,
-    "offload_param": null,
-    "offload_optimizer": null,
-    "sub_group_size": 1.000000e+09,
-    "prefetch_bucket_size": 5.000000e+07,
-    "param_persistence_threshold": 1.000000e+05,
-    "max_live_parameters": 1.000000e+09,
-    "max_reuse_distance": 1.000000e+09,
-    "gather_fp16_weights_on_model_save": false,
-    "ignore_unused_parameters": true,
-    "round_robin_gradients": false,
-    "legacy_stage1": false
-}
-[2021-09-27 03:55:04,473] [INFO] [config.py:904:print] zero_enabled ................. True
-[2021-09-27 03:55:04,473] [INFO] [config.py:904:print] zero_optimization_stage ...... 1
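The batch-size settings in the dump above are mutually consistent: a micro batch of 1 per GPU, 128 gradient-accumulation steps, and 16 data-parallel replicas (the world_size DeepSpeed reports here, matching data=0..15 in the topology) multiply out to the global batch of 2048. A one-line sanity check of that relation (illustrative arithmetic only):

    # train_batch_size = micro_batch_per_gpu * gradient_accumulation_steps * data_parallel_size
    micro_batch_per_gpu, gradient_accumulation_steps, data_parallel_size = 1, 128, 16
    assert micro_batch_per_gpu * gradient_accumulation_steps * data_parallel_size == 2048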
-[2021-09-27 03:55:04,473] [INFO] [config.py:906:print] json = {
-    "train_micro_batch_size_per_gpu": 1,
-    "train_batch_size": 2.048000e+03,
-    "gradient_clipping": 1.0,
-    "zero_optimization": {
-        "stage": 1
-    },
-    "fp16": {
-        "enabled": true,
-        "loss_scale": 0,
-        "loss_scale_window": 500,
-        "hysteresis": 2,
-        "min_loss_scale": 1,
-        "initial_scale_power": 12
-    },
-    "steps_per_print": 2.000000e+03,
-    "wall_clock_breakdown": false
-}
-[2021-09-27 03:55:04,474] [INFO] [engine.py:76:__init__] CONFIG: micro_batches=128 micro_batch_size=1
-[2021-09-27 03:55:04,910] [INFO] [engine.py:134:__init__] RANK=0 STAGE=0 LAYERS=7 [0, 7) STAGE_PARAMS=1986465792 (1986.466M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
[... analogous engine.py:134 lines follow for the first tensor-parallel group of every pipeline stage: stages 1-6 report STAGE_PARAMS=1745293312 (1745.293M) over 4 layers each, stage 7 reports STAGE_PARAMS=1986498560 (1986.499M) over 8 layers; TOTAL_PARAMS and UNIQUE_PARAMS are identical on every rank ...]
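The `json = {...}` block above is the runtime view of the DeepSpeed config this run was launched with. A minimal sketch of how such a config is handed to DeepSpeed in recent versions (a hypothetical standalone example meant to run under the deepspeed launcher, not the actual Megatron-DeepSpeed wiring):

    # Hypothetical minimal example, not the tr1-13B launcher itself.
    import deepspeed
    import torch

    ds_config = {
        "train_micro_batch_size_per_gpu": 1,
        "train_batch_size": 2048,
        "gradient_clipping": 1.0,
        "zero_optimization": {"stage": 1},
        "fp16": {"enabled": True, "loss_scale": 0, "loss_scale_window": 500,
                 "hysteresis": 2, "min_loss_scale": 1, "initial_scale_power": 12},
        "steps_per_print": 2000,
        "wall_clock_breakdown": False,
    }

    model = torch.nn.Linear(8, 8)  # stand-in for the real pipeline module
    engine, optimizer, _, _ = deepspeed.initialize(
        model=model,
        model_parameters=model.parameters(),
        config=ds_config,
    )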
- > using checkpoint value 6e-05 for learning rate
- > using checkpoint value 6e-06 for minimum learning rate
- > using checkpoint value 216320 for warmup iterations
- > using checkpoint value 126953125 for total number of iterations
- > using checkpoint value cosine for decay style
-successfully loaded 8 ZeRO state_dicts for rank 384
-successfully loaded 8 ZeRO state_dicts for rank 424
-successfully loaded 8 ZeRO state_dicts for rank 444
-successfully loaded 8 ZeRO state_dicts for rank 400
-loading 8 zero partition checkpoints for rank 384
[... checkpoint loading continues: each rank logs "successfully loaded 8 ZeRO state_dicts for rank N" and later "loading 8 zero partition checkpoints for rank N", in nondeterministic order across ranks ...]
checkpoints for rank 446 -successfully loaded 8 ZeRO state_dicts for rank 146 -loading 8 zero partition checkpoints for rank 340 -loading 8 zero partition checkpoints for rank 245 -loading 8 zero partition checkpoints for rank 430 -successfully loaded 8 ZeRO state_dicts for rank 327 -successfully loaded 8 ZeRO state_dicts for rank 331 -loading 8 zero partition checkpoints for rank 381 -loading 8 zero partition checkpoints for rank 364 -successfully loaded 8 ZeRO state_dicts for rank 20 -loading 8 zero partition checkpoints for rank 132 -successfully loaded 8 ZeRO state_dicts for rank 282 -loading 8 zero partition checkpoints for rank 233 -loading 8 zero partition checkpoints for rank 243 -successfully loaded 8 ZeRO state_dicts for rank 452 -loading 8 zero partition checkpoints for rank 431 -successfully loaded 8 ZeRO state_dicts for rank 305 -successfully loaded 8 ZeRO state_dicts for rank 21 -successfully loaded 8 ZeRO state_dicts for rank 169 -loading 8 zero partition checkpoints for rank 86 -loading 8 zero partition checkpoints for rank 442 -successfully loaded 8 ZeRO state_dicts for rank 330 -loading 8 zero partition checkpoints for rank 124 -successfully loaded 8 ZeRO state_dicts for rank 286 -loading 8 zero partition checkpoints for rank 175 -successfully loaded 8 ZeRO state_dicts for rank 326 -successfully loaded 8 ZeRO state_dicts for rank 454 -loading 8 zero partition checkpoints for rank 155 -successfully loaded 8 ZeRO state_dicts for rank 476 -successfully loaded 8 ZeRO state_dicts for rank 102 -successfully loaded 8 ZeRO state_dicts for rank 300 -loading 8 zero partition checkpoints for rank 250 -successfully loaded 8 ZeRO state_dicts for rank 1 -loading 8 zero partition checkpoints for rank 435 -successfully loaded 8 ZeRO state_dicts for rank 15 -successfully loaded 8 ZeRO state_dicts for rank 38 -successfully loaded 8 ZeRO state_dicts for rank 328 -successfully loaded 8 ZeRO state_dicts for rank 0 -loading 8 zero partition checkpoints for rank 172 -successfully loaded 8 ZeRO state_dicts for rank 463 -loading 8 zero partition checkpoints for rank 219 -successfully loaded 8 ZeRO state_dicts for rank 320 -loading 8 zero partition checkpoints for rank 218 -successfully loaded 8 ZeRO state_dicts for rank 56 -successfully loaded 8 ZeRO state_dicts for rank 271 -loading 8 zero partition checkpoints for rank 150 -successfully loaded 8 ZeRO state_dicts for rank 287 -loading 8 zero partition checkpoints for rank 309 -successfully loaded 8 ZeRO state_dicts for rank 19 -successfully loaded 8 ZeRO state_dicts for rank 24 -successfully loaded 8 ZeRO state_dicts for rank 27 -successfully loaded 8 ZeRO state_dicts for rank 112 -successfully loaded 8 ZeRO state_dicts for rank 415 -successfully loaded 8 ZeRO state_dicts for rank 310 -loading 8 zero partition checkpoints for rank 365 -loading 8 zero partition checkpoints for rank 240 -successfully loaded 8 ZeRO state_dicts for rank 78 -loading 8 zero partition checkpoints for rank 260 -loading 8 zero partition checkpoints for rank 342 -loading 8 zero partition checkpoints for rank 313 -loading 8 zero partition checkpoints for rank 438 -successfully loaded 8 ZeRO state_dicts for rank 161 -successfully loaded 8 ZeRO state_dicts for rank 308 -successfully loaded 8 ZeRO state_dicts for rank 32 -successfully loaded 8 ZeRO state_dicts for rank 344 -loading 8 zero partition checkpoints for rank 289 -successfully loaded 8 ZeRO state_dicts for rank 185 -successfully loaded 8 ZeRO state_dicts for rank 49 -successfully loaded 8 ZeRO state_dicts for rank 
57 -successfully loaded 8 ZeRO state_dicts for rank 484 -successfully loaded 8 ZeRO state_dicts for rank 487 -loading 8 zero partition checkpoints for rank 252 -successfully loaded 8 ZeRO state_dicts for rank 348 -loading 8 zero partition checkpoints for rank 239 -successfully loaded 8 ZeRO state_dicts for rank 25 -loading 8 zero partition checkpoints for rank 120 -loading 8 zero partition checkpoints for rank 276 -loading 8 zero partition checkpoints for rank 425 -loading 8 zero partition checkpoints for rank 382 -loading 8 zero partition checkpoints for rank 173 -successfully loaded 8 ZeRO state_dicts for rank 494 -successfully loaded 8 ZeRO state_dicts for rank 162 -successfully loaded 8 ZeRO state_dicts for rank 135 -successfully loaded 8 ZeRO state_dicts for rank 504 -successfully loaded 8 ZeRO state_dicts for rank 407 -successfully loaded 8 ZeRO state_dicts for rank 13 -loading 8 zero partition checkpoints for rank 247 -loading 8 zero partition checkpoints for rank 390 -successfully loaded 8 ZeRO state_dicts for rank 319 -loading 8 zero partition checkpoints for rank 377 -loading 8 zero partition checkpoints for rank 64 -loading 8 zero partition checkpoints for rank 351 -successfully loaded 8 ZeRO state_dicts for rank 48 -successfully loaded 8 ZeRO state_dicts for rank 306 -loading 8 zero partition checkpoints for rank 412 -loading 8 zero partition checkpoints for rank 195 -loading 8 zero partition checkpoints for rank 369 -loading 8 zero partition checkpoints for rank 439 -loading 8 zero partition checkpoints for rank 121 -loading 8 zero partition checkpoints for rank 343 -successfully loaded 8 ZeRO state_dicts for rank 405 -successfully loaded 8 ZeRO state_dicts for rank 106 -loading 8 zero partition checkpoints for rank 154 -loading 8 zero partition checkpoints for rank 396 -loading 8 zero partition checkpoints for rank 167 -loading 8 zero partition checkpoints for rank 231 -loading 8 zero partition checkpoints for rank 352 -loading 8 zero partition checkpoints for rank 238 -successfully loaded 8 ZeRO state_dicts for rank 283 -loading 8 zero partition checkpoints for rank 182 -loading 8 zero partition checkpoints for rank 137 -successfully loaded 8 ZeRO state_dicts for rank 346 -successfully loaded 8 ZeRO state_dicts for rank 170 -loading 8 zero partition checkpoints for rank 217 -loading 8 zero partition checkpoints for rank 193 -loading 8 zero partition checkpoints for rank 141 -successfully loaded 8 ZeRO state_dicts for rank 33 -loading 8 zero partition checkpoints for rank 263 -successfully loaded 8 ZeRO state_dicts for rank 145 -successfully loaded 8 ZeRO state_dicts for rank 473 -successfully loaded 8 ZeRO state_dicts for rank 98 -loading 8 zero partition checkpoints for rank 149 -loading 8 zero partition checkpoints for rank 136 -successfully loaded 8 ZeRO state_dicts for rank 360 -loading 8 zero partition checkpoints for rank 177 -successfully loaded 8 ZeRO state_dicts for rank 510 -loading 8 zero partition checkpoints for rank 249 -successfully loaded 8 ZeRO state_dicts for rank 490 -successfully loaded 8 ZeRO state_dicts for rank 52 -successfully loaded 8 ZeRO state_dicts for rank 325 -successfully loaded 8 ZeRO state_dicts for rank 133 -loading 8 zero partition checkpoints for rank 262 -successfully loaded 8 ZeRO state_dicts for rank 506 -loading 8 zero partition checkpoints for rank 395 -successfully loaded 8 ZeRO state_dicts for rank 147 -loading 8 zero partition checkpoints for rank 99 -loading 8 zero partition checkpoints for rank 312 -loading 8 zero partition 
checkpoints for rank 63 -successfully loaded 8 ZeRO state_dicts for rank 458 -loading 8 zero partition checkpoints for rank 127 -successfully loaded 8 ZeRO state_dicts for rank 119 -successfully loaded 8 ZeRO state_dicts for rank 134 -loading 8 zero partition checkpoints for rank 164 -loading 8 zero partition checkpoints for rank 205 -loading 8 zero partition checkpoints for rank 212 -loading 8 zero partition checkpoints for rank 288 -loading 8 zero partition checkpoints for rank 335 -loading 8 zero partition checkpoints for rank 383 -loading 8 zero partition checkpoints for rank 189 -loading 8 zero partition checkpoints for rank 166 -successfully loaded 8 ZeRO state_dicts for rank 51 -loading 8 zero partition checkpoints for rank 236 -loading 8 zero partition checkpoints for rank 77 -loading 8 zero partition checkpoints for rank 188 -loading 8 zero partition checkpoints for rank 410 -successfully loaded 8 ZeRO state_dicts for rank 55 -loading 8 zero partition checkpoints for rank 148 -loading 8 zero partition checkpoints for rank 417 -loading 8 zero partition checkpoints for rank 105 -successfully loaded 8 ZeRO state_dicts for rank 457 -successfully loaded 8 ZeRO state_dicts for rank 2 -successfully loaded 8 ZeRO state_dicts for rank 42 -loading 8 zero partition checkpoints for rank 402 -successfully loaded 8 ZeRO state_dicts for rank 500 -loading 8 zero partition checkpoints for rank 434 -successfully loaded 8 ZeRO state_dicts for rank 94 -loading 8 zero partition checkpoints for rank 82 -loading 8 zero partition checkpoints for rank 358 -successfully loaded 8 ZeRO state_dicts for rank 491 -loading 8 zero partition checkpoints for rank 61 -successfully loaded 8 ZeRO state_dicts for rank 16 -successfully loaded 8 ZeRO state_dicts for rank 58 -loading 8 zero partition checkpoints for rank 65 -loading 8 zero partition checkpoints for rank 349 -loading 8 zero partition checkpoints for rank 295 -successfully loaded 8 ZeRO state_dicts for rank 17 -successfully loaded 8 ZeRO state_dicts for rank 464 -loading 8 zero partition checkpoints for rank 404 -loading 8 zero partition checkpoints for rank 363 -successfully loaded 8 ZeRO state_dicts for rank 499 -successfully loaded 8 ZeRO state_dicts for rank 461 -successfully loaded 8 ZeRO state_dicts for rank 44 -successfully loaded 8 ZeRO state_dicts for rank 50 -successfully loaded 8 ZeRO state_dicts for rank 477 -successfully loaded 8 ZeRO state_dicts for rank 45 -loading 8 zero partition checkpoints for rank 159 -successfully loaded 8 ZeRO state_dicts for rank 30 -loading 8 zero partition checkpoints for rank 345 -successfully loaded 8 ZeRO state_dicts for rank 53 -loading 8 zero partition checkpoints for rank 353 -successfully loaded 8 ZeRO state_dicts for rank 23 -loading 8 zero partition checkpoints for rank 334 -loading 8 zero partition checkpoints for rank 315 -loading 8 zero partition checkpoints for rank 333 -loading 8 zero partition checkpoints for rank 273 -loading 8 zero partition checkpoints for rank 8 -successfully loaded 8 ZeRO state_dicts for rank 503 -successfully loaded 8 ZeRO state_dicts for rank 12 -loading 8 zero partition checkpoints for rank 244 -loading 8 zero partition checkpoints for rank 151 -loading 8 zero partition checkpoints for rank 370 -successfully loaded 8 ZeRO state_dicts for rank 59 -successfully loaded 8 ZeRO state_dicts for rank 31 -loading 8 zero partition checkpoints for rank 311 -loading 8 zero partition checkpoints for rank 426 -successfully loaded 8 ZeRO state_dicts for rank 486 -loading 8 zero partition 
checkpoints for rank 399 -successfully loaded 8 ZeRO state_dicts for rank 26 -loading 8 zero partition checkpoints for rank 474 -loading 8 zero partition checkpoints for rank 200 -successfully loaded 8 ZeRO state_dicts for rank 54 -loading 8 zero partition checkpoints for rank 101 -successfully loaded 8 ZeRO state_dicts for rank 46 -loading 8 zero partition checkpoints for rank 139 -successfully loaded 8 ZeRO state_dicts for rank 498 -successfully loaded 8 ZeRO state_dicts for rank 307 -loading 8 zero partition checkpoints for rank 471 -successfully loaded 8 ZeRO state_dicts for rank 469 -successfully loaded 8 ZeRO state_dicts for rank 495 -successfully loaded 8 ZeRO state_dicts for rank 22 -loading 8 zero partition checkpoints for rank 104 -loading 8 zero partition checkpoints for rank 272 -successfully loaded 8 ZeRO state_dicts for rank 28 -loading 8 zero partition checkpoints for rank 91 -loading 8 zero partition checkpoints for rank 160 -loading 8 zero partition checkpoints for rank 354 -loading 8 zero partition checkpoints for rank 267 -successfully loaded 8 ZeRO state_dicts for rank 467 -loading 8 zero partition checkpoints for rank 317 -loading 8 zero partition checkpoints for rank 361 -loading 8 zero partition checkpoints for rank 281 -loading 8 zero partition checkpoints for rank 73 -loading 8 zero partition checkpoints for rank 103 -loading 8 zero partition checkpoints for rank 107 -successfully loaded 8 ZeRO state_dicts for rank 34 -loading 8 zero partition checkpoints for rank 202 -loading 8 zero partition checkpoints for rank 140 -successfully loaded 8 ZeRO state_dicts for rank 14 -loading 8 zero partition checkpoints for rank 255 -successfully loaded 8 ZeRO state_dicts for rank 482 -loading 8 zero partition checkpoints for rank 293 -loading 8 zero partition checkpoints for rank 220 -loading 8 zero partition checkpoints for rank 368 -loading 8 zero partition checkpoints for rank 201 -successfully loaded 8 ZeRO state_dicts for rank 483 -loading 8 zero partition checkpoints for rank 269 -loading 8 zero partition checkpoints for rank 355 -loading 8 zero partition checkpoints for rank 168 -loading 8 zero partition checkpoints for rank 427 -loading 8 zero partition checkpoints for rank 318 -loading 8 zero partition checkpoints for rank 284 -loading 8 zero partition checkpoints for rank 122 -loading 8 zero partition checkpoints for rank 93 -loading 8 zero partition checkpoints for rank 418 -loading 8 zero partition checkpoints for rank 191 -loading 8 zero partition checkpoints for rank 203 -loading 8 zero partition checkpoints for rank 359 -loading 8 zero partition checkpoints for rank 291 -loading 8 zero partition checkpoints for rank 207 -loading 8 zero partition checkpoints for rank 268 -loading 8 zero partition checkpoints for rank 316 -loading 8 zero partition checkpoints for rank 187 -loading 8 zero partition checkpoints for rank 371 -loading 8 zero partition checkpoints for rank 131 -loading 8 zero partition checkpoints for rank 97 -successfully loaded 8 ZeRO state_dicts for rank 502 -loading 8 zero partition checkpoints for rank 398 -loading 8 zero partition checkpoints for rank 156 -successfully loaded 8 ZeRO state_dicts for rank 18 -successfully loaded 8 ZeRO state_dicts for rank 508 -loading 8 zero partition checkpoints for rank 215 -loading 8 zero partition checkpoints for rank 290 -successfully loaded 8 ZeRO state_dicts for rank 497 -successfully loaded 8 ZeRO state_dicts for rank 496 -loading 8 zero partition checkpoints for rank 117 -loading 8 zero partition 
checkpoints for rank 138 -successfully loaded 8 ZeRO state_dicts for rank 493 -loading 8 zero partition checkpoints for rank 79 -loading 8 zero partition checkpoints for rank 181 -loading 8 zero partition checkpoints for rank 209 -successfully loaded 8 ZeRO state_dicts for rank 488 -successfully loaded 8 ZeRO state_dicts for rank 485 -loading 8 zero partition checkpoints for rank 89 -loading 8 zero partition checkpoints for rank 157 -successfully loaded 8 ZeRO state_dicts for rank 489 -successfully loaded 8 ZeRO state_dicts for rank 501 -loading 8 zero partition checkpoints for rank 176 -loading 8 zero partition checkpoints for rank 468 -loading 8 zero partition checkpoints for rank 143 -loading 8 zero partition checkpoints for rank 100 -loading 8 zero partition checkpoints for rank 223 -loading 8 zero partition checkpoints for rank 87 -loading 8 zero partition checkpoints for rank 74 -loading 8 zero partition checkpoints for rank 258 -successfully loaded 8 ZeRO state_dicts for rank 480 -loading 8 zero partition checkpoints for rank 406 -loading 8 zero partition checkpoints for rank 183 -loading 8 zero partition checkpoints for rank 190 -loading 8 zero partition checkpoints for rank 275 -loading 8 zero partition checkpoints for rank 71 -loading 8 zero partition checkpoints for rank 85 -loading 8 zero partition checkpoints for rank 329 -successfully loaded 8 ZeRO state_dicts for rank 511 -loading 8 zero partition checkpoints for rank 72 -successfully loaded 8 ZeRO state_dicts for rank 492 -loading 8 zero partition checkpoints for rank 211 -loading 8 zero partition checkpoints for rank 357 -loading 8 zero partition checkpoints for rank 321 -loading 8 zero partition checkpoints for rank 322 -loading 8 zero partition checkpoints for rank 118 -loading 8 zero partition checkpoints for rank 113 -loading 8 zero partition checkpoints for rank 142 -loading 8 zero partition checkpoints for rank 213 -loading 8 zero partition checkpoints for rank 478 -loading 8 zero partition checkpoints for rank 460 -successfully loaded 8 ZeRO state_dicts for rank 509 -loading 8 zero partition checkpoints for rank 186 -loading 8 zero partition checkpoints for rank 253 -loading 8 zero partition checkpoints for rank 228 -loading 8 zero partition checkpoints for rank 419 -successfully loaded 8 ZeRO state_dicts for rank 505 -loading 8 zero partition checkpoints for rank 266 -loading 8 zero partition checkpoints for rank 413 -loading 8 zero partition checkpoints for rank 254 -loading 8 zero partition checkpoints for rank 470 -loading 8 zero partition checkpoints for rank 9 -loading 8 zero partition checkpoints for rank 11 -loading 8 zero partition checkpoints for rank 184 -loading 8 zero partition checkpoints for rank 274 -loading 8 zero partition checkpoints for rank 324 -loading 8 zero partition checkpoints for rank 314 -loading 8 zero partition checkpoints for rank 362 -loading 8 zero partition checkpoints for rank 294 -loading 8 zero partition checkpoints for rank 90 -loading 8 zero partition checkpoints for rank 409 -loading 8 zero partition checkpoints for rank 41 -loading 8 zero partition checkpoints for rank 450 -loading 8 zero partition checkpoints for rank 448 -loading 8 zero partition checkpoints for rank 259 -loading 8 zero partition checkpoints for rank 179 -loading 8 zero partition checkpoints for rank 270 -loading 8 zero partition checkpoints for rank 356 -loading 8 zero partition checkpoints for rank 165 -successfully loaded 8 ZeRO state_dicts for rank 465 -loading 8 zero partition checkpoints for rank 214 
-loading 8 zero partition checkpoints for rank 221 -loading 8 zero partition checkpoints for rank 83 -loading 8 zero partition checkpoints for rank 76 -loading 8 zero partition checkpoints for rank 414 -loading 8 zero partition checkpoints for rank 95 -loading 8 zero partition checkpoints for rank 114 -successfully loaded 8 ZeRO state_dicts for rank 110 -loading 8 zero partition checkpoints for rank 37 -loading 8 zero partition checkpoints for rank 116 -loading 8 zero partition checkpoints for rank 10 -loading 8 zero partition checkpoints for rank 411 -successfully loaded 8 ZeRO state_dicts for rank 481 -loading 8 zero partition checkpoints for rank 331 -loading 8 zero partition checkpoints for rank 397 -loading 8 zero partition checkpoints for rank 222 -loading 8 zero partition checkpoints for rank 230 -loading 8 zero partition checkpoints for rank 326 -loading 8 zero partition checkpoints for rank 102 -loading 8 zero partition checkpoints for rank 286 -loading 8 zero partition checkpoints for rank 75 -loading 8 zero partition checkpoints for rank 287 -loading 8 zero partition checkpoints for rank 453 -loading 8 zero partition checkpoints for rank 163 -loading 8 zero partition checkpoints for rank 280 -loading 8 zero partition checkpoints for rank 305 -loading 8 zero partition checkpoints for rank 271 -loading 8 zero partition checkpoints for rank 62 -loading 8 zero partition checkpoints for rank 78 -loading 8 zero partition checkpoints for rank 144 -loading 8 zero partition checkpoints for rank 282 -loading 8 zero partition checkpoints for rank 310 -loading 8 zero partition checkpoints for rank 456 -loading 8 zero partition checkpoints for rank 308 -loading 8 zero partition checkpoints for rank 92 -loading 8 zero partition checkpoints for rank 66 -loading 8 zero partition checkpoints for rank 161 -loading 8 zero partition checkpoints for rank 47 -loading 8 zero partition checkpoints for rank 472 -loading 8 zero partition checkpoints for rank 43 -loading 8 zero partition checkpoints for rank 350 -loading 8 zero partition checkpoints for rank 372 -loading 8 zero partition checkpoints for rank 35 -loading 8 zero partition checkpoints for rank 130 -loading 8 zero partition checkpoints for rank 70 -loading 8 zero partition checkpoints for rank 60 -loading 8 zero partition checkpoints for rank 1 -loading 8 zero partition checkpoints for rank 38 -loading 8 zero partition checkpoints for rank 374 -loading 8 zero partition checkpoints for rank 29 -loading 8 zero partition checkpoints for rank 407 -loading 8 zero partition checkpoints for rank 210 -loading 8 zero partition checkpoints for rank 67 -loading 8 zero partition checkpoints for rank 171 -loading 8 zero partition checkpoints for rank 80 -loading 8 zero partition checkpoints for rank 449 -loading 8 zero partition checkpoints for rank 106 -loading 8 zero partition checkpoints for rank 81 -loading 8 zero partition checkpoints for rank 347 -loading 8 zero partition checkpoints for rank 479 -loading 8 zero partition checkpoints for rank 405 -loading 8 zero partition checkpoints for rank 346 -loading 8 zero partition checkpoints for rank 98 -loading 8 zero partition checkpoints for rank 283 -loading 8 zero partition checkpoints for rank 264 -loading 8 zero partition checkpoints for rank 415 -loading 8 zero partition checkpoints for rank 68 -loading 8 zero partition checkpoints for rank 475 -loading 8 zero partition checkpoints for rank 40 -loading 8 zero partition checkpoints for rank 145 -loading 8 zero partition checkpoints for rank 27 
-loading 8 zero partition checkpoints for rank 24 -loading 8 zero partition checkpoints for rank 162 -loading 8 zero partition checkpoints for rank 459 -loading 8 zero partition checkpoints for rank 134 -loading 8 zero partition checkpoints for rank 25 -loading 8 zero partition checkpoints for rank 56 -loading 8 zero partition checkpoints for rank 109 -loading 8 zero partition checkpoints for rank 303 -loading 8 zero partition checkpoints for rank 119 -loading 8 zero partition checkpoints for rank 115 -loading 8 zero partition checkpoints for rank 319 -loading 8 zero partition checkpoints for rank 185 -loading 8 zero partition checkpoints for rank 208 -loading 8 zero partition checkpoints for rank 13 -loading 8 zero partition checkpoints for rank 476 -loading 8 zero partition checkpoints for rank 375 -loading 8 zero partition checkpoints for rank 348 -loading 8 zero partition checkpoints for rank 57 -loading 8 zero partition checkpoints for rank 360 -loading 8 zero partition checkpoints for rank 33 -loading 8 zero partition checkpoints for rank 15 -loading 8 zero partition checkpoints for rank 328 -loading 8 zero partition checkpoints for rank 330 -loading 8 zero partition checkpoints for rank 129 -loading 8 zero partition checkpoints for rank 323 -loading 8 zero partition checkpoints for rank 327 -loading 8 zero partition checkpoints for rank 21 -loading 8 zero partition checkpoints for rank 487 -loading 8 zero partition checkpoints for rank 112 -loading 8 zero partition checkpoints for rank 373 -loading 8 zero partition checkpoints for rank 506 -loading 8 zero partition checkpoints for rank 504 -loading 8 zero partition checkpoints for rank 48 -loading 8 zero partition checkpoints for rank 510 -loading 8 zero partition checkpoints for rank 301 -loading 8 zero partition checkpoints for rank 344 -loading 8 zero partition checkpoints for rank 42 -loading 8 zero partition checkpoints for rank 2 -loading 8 zero partition checkpoints for rank 300 -loading 8 zero partition checkpoints for rank 320 -loading 8 zero partition checkpoints for rank 51 -loading 8 zero partition checkpoints for rank 20 -loading 8 zero partition checkpoints for rank 462 -loading 8 zero partition checkpoints for rank 36 -loading 8 zero partition checkpoints for rank 304 -loading 8 zero partition checkpoints for rank 500 -loading 8 zero partition checkpoints for rank 473 -loading 8 zero partition checkpoints for rank 461 -loading 8 zero partition checkpoints for rank 307 -loading 8 zero partition checkpoints for rank 491 -loading 8 zero partition checkpoints for rank 451 -loading 8 zero partition checkpoints for rank 45 -loading 8 zero partition checkpoints for rank 325 -loading 8 zero partition checkpoints for rank 507 -loading 8 zero partition checkpoints for rank 23 -loading 8 zero partition checkpoints for rank 44 -loading 8 zero partition checkpoints for rank 32 -loading 8 zero partition checkpoints for rank 52 -loading 8 zero partition checkpoints for rank 30 -loading 8 zero partition checkpoints for rank 53 -loading 8 zero partition checkpoints for rank 477 -loading 8 zero partition checkpoints for rank 94 -loading 8 zero partition checkpoints for rank 58 -loading 8 zero partition checkpoints for rank 31 -loading 8 zero partition checkpoints for rank 59 -loading 8 zero partition checkpoints for rank 39 -loading 8 zero partition checkpoints for rank 26 -loading 8 zero partition checkpoints for rank 146 -loading 8 zero partition checkpoints for rank 452 -loading 8 zero partition checkpoints for rank 302 -loading 8 
zero partition checkpoints for rank 28 -loading 8 zero partition checkpoints for rank 17 -loading 8 zero partition checkpoints for rank 108 -loading 8 zero partition checkpoints for rank 469 -loading 8 zero partition checkpoints for rank 169 -loading 8 zero partition checkpoints for rank 46 -loading 8 zero partition checkpoints for rank 306 -loading 8 zero partition checkpoints for rank 490 -loading 8 zero partition checkpoints for rank 22 -loading 8 zero partition checkpoints for rank 55 -loading 8 zero partition checkpoints for rank 464 -loading 8 zero partition checkpoints for rank 457 -loading 8 zero partition checkpoints for rank 463 -loading 8 zero partition checkpoints for rank 458 -loading 8 zero partition checkpoints for rank 170 -loading 8 zero partition checkpoints for rank 34 -loading 8 zero partition checkpoints for rank 147 -loading 8 zero partition checkpoints for rank 488 -loading 8 zero partition checkpoints for rank 18 -loading 8 zero partition checkpoints for rank 485 -loading 8 zero partition checkpoints for rank 111 -loading 8 zero partition checkpoints for rank 501 -loading 8 zero partition checkpoints for rank 493 -loading 8 zero partition checkpoints for rank 455 -loading 8 zero partition checkpoints for rank 225 -loading 8 zero partition checkpoints for rank 12 -loading 8 zero partition checkpoints for rank 467 -loading 8 zero partition checkpoints for rank 49 -loading 8 zero partition checkpoints for rank 14 -loading 8 zero partition checkpoints for rank 492 -loading 8 zero partition checkpoints for rank 3 -loading 8 zero partition checkpoints for rank 54 -loading 8 zero partition checkpoints for rank 454 -loading 8 zero partition checkpoints for rank 227 -loading 8 zero partition checkpoints for rank 19 -loading 8 zero partition checkpoints for rank 509 -loading 8 zero partition checkpoints for rank 489 -loading 8 zero partition checkpoints for rank 0 -loading 8 zero partition checkpoints for rank 226 - checkpoint version 3.0 -loading 8 zero partition checkpoints for rank 466 -loading 8 zero partition checkpoints for rank 499 -loading 8 zero partition checkpoints for rank 484 -loading 8 zero partition checkpoints for rank 224 -loading 8 zero partition checkpoints for rank 135 -loading 8 zero partition checkpoints for rank 50 -loading 8 zero partition checkpoints for rank 110 -loading 8 zero partition checkpoints for rank 505 -loading 8 zero partition checkpoints for rank 497 -loading 8 zero partition checkpoints for rank 496 -loading 8 zero partition checkpoints for rank 498 -loading 8 zero partition checkpoints for rank 494 -loading 8 zero partition checkpoints for rank 133 -loading 8 zero partition checkpoints for rank 486 -loading 8 zero partition checkpoints for rank 16 -loading 8 zero partition checkpoints for rank 495 -loading 8 zero partition checkpoints for rank 503 -loading 8 zero partition checkpoints for rank 465 -loading 8 zero partition checkpoints for rank 502 -loading 8 zero partition checkpoints for rank 508 -loading 8 zero partition checkpoints for rank 511 -loading 8 zero partition checkpoints for rank 480 -successfully loaded 8 ZeRO state_dicts for rank 5 -loading 8 zero partition checkpoints for rank 482 -loading 8 zero partition checkpoints for rank 483 -loading 8 zero partition checkpoints for rank 481 -successfully loaded 8 ZeRO state_dicts for rank 6 -successfully loaded 8 ZeRO state_dicts for rank 4 -successfully loaded 8 ZeRO state_dicts for rank 7 -loading 8 zero partition checkpoints for rank 5 -loading 8 zero partition checkpoints for 
rank 4 -loading 8 zero partition checkpoints for rank 6 -loading 8 zero partition checkpoints for rank 7 - successfully loaded checkpoint from /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints at iteration 9768 -time (ms) | load-checkpoint: 91243.56 -[after model, optimizer, and learning rate scheduler are built] datetime: 2021-09-27 03:56:36 -> building train, validation, and test datasets ... - > datasets target sizes (minimum size): - train: 300000000 - validation: 1638400 - test: 10240 -> building train, validation, and test datasets for GPT ... - > building dataset index ... - reading sizes... - reading pointers... - reading document index... - creating numpy buffer of mmap... - creating memory view of numpy buffer... - > finished creating indexed dataset in 0.143013 seconds - number of documents: 304230423 - > dataset split: - train: - document indices in [0, 288714672) total of 288714672 documents - validation: - document indices in [288714672, 303926193) total of 15211521 documents - test: - document indices in [303926193, 304230423) total of 304230 documents - > loading doc-idx mapping from /gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document_train_indexmap_300000000ns_2048sl_42s_doc_idx.npy - > loading sample-idx mapping from /gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document_train_indexmap_300000000ns_2048sl_42s_sample_idx.npy - > loading shuffle-idx mapping from /gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document_train_indexmap_300000000ns_2048sl_42s_shuffle_idx.npy - loaded indexed file in 0.289 seconds - total number of samples: 394611670 - total number of epochs: 3 - > loading doc-idx mapping from /gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document_valid_indexmap_1638400ns_2048sl_42s_doc_idx.npy - > loading sample-idx mapping from /gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document_valid_indexmap_1638400ns_2048sl_42s_sample_idx.npy - > loading shuffle-idx mapping from /gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document_valid_indexmap_1638400ns_2048sl_42s_shuffle_idx.npy - loaded indexed file in 0.388 seconds - total number of samples: 6927161 - total number of epochs: 1 - > loading doc-idx mapping from /gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document_test_indexmap_10240ns_2048sl_42s_doc_idx.npy - > loading sample-idx mapping from /gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document_test_indexmap_10240ns_2048sl_42s_sample_idx.npy - > loading shuffle-idx mapping from /gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document_test_indexmap_10240ns_2048sl_42s_shuffle_idx.npy - loaded indexed file in 0.061 seconds - total number of samples: 137384 - total number of epochs: 1 -> finished creating GPT datasets ... -[after dataloaders are built] datetime: 2021-09-27 03:56:43 -done with setup ... -training ... 
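The split boundaries in the dataset section above follow from the document count by Megatron-LM's round-then-adjust split arithmetic. A minimal sketch in Python; the 949/50/1 ratio is an assumption inferred from the reported index ranges (it reproduces them exactly), not a value read out of this log:

    # Sketch of Megatron-LM-style train/valid/test split computation.
    # Assumed: --split 949,50,1; num_docs is taken from the log above.
    num_docs = 304_230_423
    weights = [949.0, 50.0, 1.0]
    fracs = [w / sum(weights) for w in weights]

    idx = [0]
    for f in fracs:
        idx.append(idx[-1] + int(round(f * num_docs)))  # cumulative doc boundaries
    drift = idx[-1] - num_docs                          # rounding drift, here -1
    idx = [idx[0]] + [i - drift for i in idx[1:]]       # fold the drift back in

    print(idx)  # [0, 288714672, 303926193, 304230423], matching the log

The epoch counts are consistent with the same numbers: one pass over the training split yields about 131.5M samples of 2048 tokens, so three epochs (394,611,670 samples) is the smallest whole number of passes covering the 300,000,000-sample target.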
-time (ms) | model-and-optimizer-setup: 102057.80 | train/valid/test-data-iterators-setup: 5731.66
-[before the start of training step] datetime: 2021-09-27 03:56:43
-[2021-09-27 03:56:43,457] [INFO] [checkpointing.py:408:forward] Activation Checkpointing Information
-[2021-09-27 03:56:43,457] [INFO] [checkpointing.py:409:forward] ----Partition Activations False, CPU CHECKPOINTING False
-[2021-09-27 03:56:43,457] [INFO] [checkpointing.py:412:forward] ----contiguous Memory Checkpointing False with 32 total layers
-[2021-09-27 03:56:43,457] [INFO] [checkpointing.py:415:forward] ----Synchronization False
-[2021-09-27 03:56:43,457] [INFO] [checkpointing.py:416:forward] ----Profiling time in checkpointing False
-[Rank 192] (after 9770 iterations) memory (MB) | allocated: 4613.21923828125 | max allocated: 10290.1357421875 | reserved: 15132.0 | max reserved: 15132.0
-[Rank 129] (after 9770 iterations) memory (MB) | allocated: 4613.21923828125 | max allocated: 10562.13623046875 | reserved: 15500.0 | max reserved: 15500.0
-[Rank 130] (after 9770 iterations) memory (MB) | allocated: 4613.21923828125 | max allocated: 10562.13623046875 | reserved: 15364.0 | max reserved: 15364.0
-[Rank 64] (after 9770 iterations) memory (MB) | allocated: 4613.21923828125 | max allocated: 10834.13671875 | reserved: 15820.0 | max reserved: 15820.0
-[Rank 0] (after 9770 iterations) memory (MB) | allocated: 5267.49951171875 | max allocated: 12476.68310546875 | reserved: 18256.0 | max reserved: 18256.0
-[Rank 2] (after 9770 iterations) memory (MB) | allocated: 5267.49951171875 | max allocated: 12476.68310546875 | reserved: 17788.0 | max reserved: 17788.0
-[Rank 256] (after 9770 iterations) memory (MB) | allocated: 4613.21923828125 | max allocated: 10018.13525390625 | reserved: 14812.0 | max reserved: 14812.0
-[Rank 257] (after 9770 iterations) memory (MB) | allocated: 4613.21923828125 | max allocated: 10018.13525390625 | reserved: 14940.0 | max reserved: 14940.0
-[Rank 193] (after 9770 iterations) memory (MB) | allocated: 4613.21923828125 | max allocated: 10290.1357421875 | reserved: 15096.0 | max reserved: 15096.0
-[Rank 194] (after 9770 iterations) memory (MB) | allocated: 4613.21923828125 | max allocated: 10290.1357421875 | reserved: 15112.0 | max reserved: 15112.0
-[Rank 128] (after 9770 iterations) memory (MB) | allocated: 4613.21923828125 | max allocated: 10562.13623046875 | reserved: 15456.0 | max reserved: 15456.0
-[Rank 385] (after 9770 iterations) memory (MB) | allocated: 4613.21923828125 | max allocated: 9474.13427734375 | reserved: 14312.0 | max reserved: 14312.0
-[Rank 320] (after 9770 iterations) memory (MB) | allocated: 4613.21923828125 | max allocated: 9746.134765625 | reserved: 14716.0 | max reserved: 14716.0
-[Rank 65] (after 9770 iterations) memory (MB) | allocated: 4613.21923828125 | max allocated: 10834.13671875 | reserved: 15632.0 | max reserved: 15632.0
-[Rank 1] (after 9770 iterations) memory (MB) | allocated: 5267.49951171875 | max allocated: 12476.68310546875 | reserved: 18256.0 | max reserved: 18256.0
-[Rank 258] (after 9770 iterations) memory (MB) | allocated: 4613.21923828125 | max allocated: 10018.13525390625 | reserved: 14696.0 | max reserved: 14696.0
-[Rank 131] (after 9770 iterations) memory (MB) | allocated: 4613.21923828125 | max allocated: 10562.13623046875 | reserved: 15532.0 | max reserved: 15532.0
-[Rank 384] (after 9770 iterations) memory (MB) | allocated: 4613.21923828125 | max allocated: 9474.13427734375 | reserved: 14268.0 | max reserved: 14268.0
-[Rank 449] (after 9770 iterations) memory (MB) | allocated: 5685.35986328125 | max allocated: 10463.337890625 | reserved: 15736.0 | max reserved: 15736.0
-[Rank 448] (after 9770 iterations) memory (MB) | allocated: 5685.35986328125 | max allocated: 10463.33642578125 | reserved: 15736.0 | max reserved: 15736.0
-[Rank 322] (after 9770 iterations) memory (MB) | allocated: 4613.21923828125 | max allocated: 9746.134765625 | reserved: 14616.0 | max reserved: 14616.0
-[Rank 66] (after 9770 iterations) memory (MB) | allocated: 4613.21923828125 | max allocated: 10834.13671875 | reserved: 15828.0 | max reserved: 15828.0
-[Rank 3] (after 9770 iterations) memory (MB) | allocated: 5267.49951171875 | max allocated: 12476.68310546875 | reserved: 18256.0 | max reserved: 18256.0
-[Rank 259] (after 9770 iterations) memory (MB) | allocated: 4613.21923828125 | max allocated: 10018.13525390625 | reserved: 14712.0 | max reserved: 14712.0
-[Rank 195] (after 9770 iterations) memory (MB) | allocated: 4613.21923828125 | max allocated: 10290.1357421875 | reserved: 15208.0 | max reserved: 15208.0
-[Rank 387] (after 9770 iterations) memory (MB) | allocated: 4613.21923828125 | max allocated: 9474.13427734375 | reserved: 14312.0 | max reserved: 14312.0
-[Rank 386] (after 9770 iterations) memory (MB) | allocated: 4613.21923828125 | max allocated: 9474.13427734375 | reserved: 14312.0 | max reserved: 14312.0
-[Rank 451] (after 9770 iterations) memory (MB) | allocated: 5685.35986328125 | max allocated: 10463.3369140625 | reserved: 15736.0 | max reserved: 15736.0
-[Rank 323] (after 9770 iterations) memory (MB) | allocated: 4613.21923828125 | max allocated: 9746.134765625 | reserved: 14648.0 | max reserved: 14648.0
-[Rank 67] (after 9770 iterations) memory (MB) | allocated: 4613.21923828125 | max allocated: 10834.13671875 | reserved: 15536.0 | max reserved: 15536.0
-[Rank 450] (after 9770 iterations) memory (MB) | allocated: 5685.35986328125 | max allocated: 10463.33544921875 | reserved: 15736.0 | max reserved: 15736.0
-[Rank 321] (after 9770 iterations) memory (MB) | allocated: 4613.21923828125 | max allocated: 9746.134765625 | reserved: 14684.0 | max reserved: 14684.0
- iteration 9770/ 159576 | consumed samples: 701760 | elapsed time per iteration (ms): 21146.4 | learning rate: 6.000E-05 | global batch size: 240 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 9780/ 159576 | consumed samples: 704160 | elapsed time per iteration (ms): 13340.2 | learning rate: 6.000E-05 | global batch size: 240 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 9790/ 159576 | consumed samples: 706560 | elapsed time per iteration (ms): 13419.1 | learning rate: 6.000E-05 | global batch size: 240 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 9800/ 159576 | consumed samples: 708976 | elapsed time per iteration (ms): 13591.3 | learning rate: 6.000E-05 | global batch size: 256 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 9810/ 159576 | consumed samples: 711536 | elapsed time per iteration (ms): 13986.8 | learning rate: 6.000E-05 | global batch size: 256 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
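The "[Rank N] ... memory (MB)" lines above report the PyTorch CUDA caching-allocator counters once per rank. A rough sketch of an equivalent helper (not the actual Megatron-DeepSpeed source; it assumes torch.distributed is already initialized):

    import torch
    import torch.distributed as dist

    def report_memory(iteration: int) -> None:
        # Same four counters as the log lines above, converted to MB.
        mb = 1024 * 1024
        rank = dist.get_rank() if dist.is_initialized() else 0
        print(f"[Rank {rank}] (after {iteration} iterations) memory (MB)"
              f" | allocated: {torch.cuda.memory_allocated() / mb}"
              f" | max allocated: {torch.cuda.max_memory_allocated() / mb}"
              f" | reserved: {torch.cuda.memory_reserved() / mb}"
              f" | max reserved: {torch.cuda.max_memory_reserved() / mb}")

The spread between rank groups (about 18 GB reserved on ranks 0-3 versus about 14 GB on ranks 384-387) is consistent with pipeline stages holding different slices of the model, the embedding-bearing first stage being the heaviest.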
- iteration 9820/ 159576 | consumed samples: 714096 | elapsed time per iteration (ms): 14105.8 | learning rate: 6.000E-05 | global batch size: 256 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 9830/ 159576 | consumed samples: 716656 | elapsed time per iteration (ms): 14030.2 | learning rate: 6.000E-05 | global batch size: 256 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 9840/ 159576 | consumed samples: 719216 | elapsed time per iteration (ms): 14188.9 | learning rate: 6.000E-05 | global batch size: 256 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-[2021-09-27 04:14:28] PULSE: tr8-104B is running for 20:12 since 2021-09-27T03:54:16 (1188168 on 'gpu_p13' partition (r6i5n[7-8],r6i6n0,r6i7n[7-8],r7i0n[0-5],r7i1n[7-8],r7i2n[0-1,5,8],r7i3n2,r7i5n7,r7i6n[1-4,8],r7i7n[0-4,6-8],r8i0n[0-8],r8i1n[0-4],r8i2n8,r8i3n[0-3,8],r8i4n[0-1],r8i6n[2-3,5-6],r8i7n[3-8],r9i0n[0-6,8],r9i1n[0-8],r9i2n[0,3-8],r9i3n[0-2,6-8],r9i4n[0-6,8],r9i5n[0-8],r9i6n[0-8],r9i7n[1-8])
- iteration 9850/ 159576 | consumed samples: 721776 | elapsed time per iteration (ms): 14071.1 | learning rate: 6.000E-05 | global batch size: 256 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 9860/ 159576 | consumed samples: 724336 | elapsed time per iteration (ms): 14125.1 | learning rate: 6.000E-05 | global batch size: 256 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 9870/ 159576 | consumed samples: 726896 | elapsed time per iteration (ms): 14170.2 | learning rate: 6.000E-05 | global batch size: 256 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 9880/ 159576 | consumed samples: 729456 | elapsed time per iteration (ms): 14139.5 | learning rate: 6.000E-05 | global batch size: 256 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 9890/ 159576 | consumed samples: 732016 | elapsed time per iteration (ms): 14156.0 | learning rate: 6.000E-05 | global batch size: 256 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 9900/ 159576 | consumed samples: 734576 | elapsed time per iteration (ms): 14057.9 | learning rate: 6.000E-05 | global batch size: 256 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 9910/ 159576 | consumed samples: 737136 | elapsed time per iteration (ms): 14129.8 | learning rate: 6.000E-05 | global batch size: 256 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 9920/ 159576 | consumed samples: 739696 | elapsed time per iteration (ms): 14157.7 | learning rate: 6.000E-05 | global batch size: 256 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
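A quick throughput cross-check against the settled iteration lines above (about 14.1 s per step at a global batch of 256 sequences of 2048 tokens each, across the 512 ranks seen during checkpoint loading):

    # Back-of-the-envelope throughput from the fields in one iteration line.
    global_batch = 256        # "global batch size"
    step_ms = 14105.8         # "elapsed time per iteration (ms)"
    seq_len = 2048            # sequence length of the index maps above

    samples_per_s = global_batch / (step_ms / 1000.0)  # ~18.1 samples/s
    tokens_per_s = samples_per_s * seq_len             # ~37,000 tokens/s
    print(f"{samples_per_s:.1f} samples/s, {tokens_per_s:,.0f} tokens/s")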
- iteration 9930/ 159576 | consumed samples: 742256 | elapsed time per iteration (ms): 14024.1 | learning rate: 6.000E-05 | global batch size: 256 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 9940/ 159576 | consumed samples: 744816 | elapsed time per iteration (ms): 13971.4 | learning rate: 6.000E-05 | global batch size: 256 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 9950/ 159576 | consumed samples: 747376 | elapsed time per iteration (ms): 14101.5 | learning rate: 6.000E-05 | global batch size: 256 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 9960/ 159576 | consumed samples: 749936 | elapsed time per iteration (ms): 14210.0 | learning rate: 6.000E-05 | global batch size: 256 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 9970/ 159576 | consumed samples: 752496 | elapsed time per iteration (ms): 14219.6 | learning rate: 6.000E-05 | global batch size: 256 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 9980/ 159576 | consumed samples: 755056 | elapsed time per iteration (ms): 14117.6 | learning rate: 6.000E-05 | global batch size: 256 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 9990/ 159576 | consumed samples: 757712 | elapsed time per iteration (ms): 14400.0 | learning rate: 6.000E-05 | global batch size: 272 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-[2021-09-27 04:51:19,357] [INFO] [logging.py:68:log_dist] [Rank 0] step=10000, skipped=1052, lr=[5.999919375575235e-05, 5.999919375575235e-05], mom=[(0.9, 0.999), (0.9, 0.999)]
-steps: 10000 loss: nan iter time (s): 0.007 samples/sec: 37472.688
- iteration 10000/ 159576 | consumed samples: 760432 | elapsed time per iteration (ms): 14648.0 | learning rate: 6.000E-05 | global batch size: 272 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-------------------------------------------------------------------------------------------------
- validation loss at iteration 10000 | lm loss value: 7.270623E+00 | lm loss PPL: 1.437445E+03 |
-------------------------------------------------------------------------------------------------
- iteration 10010/ 159576 | consumed samples: 763152 | elapsed time per iteration (ms): 16469.3 | learning rate: 6.000E-05 | global batch size: 272 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 10020/ 159576 | consumed samples: 765872 | elapsed time per iteration (ms): 14573.2 | learning rate: 6.000E-05 | global batch size: 272 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 10030/ 159576 | consumed samples: 768592 | elapsed time per iteration (ms): 14611.8 | learning rate: 6.000E-05 | global batch size: 272 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
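The two numbers in the validation block are redundant by construction: perplexity is exp(lm loss), which makes the report easy to sanity-check:

    import math

    lm_loss = 7.270623
    print(math.exp(lm_loss))  # ~1437.445, i.e. the reported lm loss PPL 1.437445E+03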
- iteration 10040/ 159576 | consumed samples: 771312 | elapsed time per iteration (ms): 14782.8 | learning rate: 6.000E-05 | global batch size: 272 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 10050/ 159576 | consumed samples: 774032 | elapsed time per iteration (ms): 14722.8 | learning rate: 6.000E-05 | global batch size: 272 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 10060/ 159576 | consumed samples: 776752 | elapsed time per iteration (ms): 14595.9 | learning rate: 6.000E-05 | global batch size: 272 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 10070/ 159576 | consumed samples: 779472 | elapsed time per iteration (ms): 14712.5 | learning rate: 6.000E-05 | global batch size: 272 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 10080/ 159576 | consumed samples: 782192 | elapsed time per iteration (ms): 14640.3 | learning rate: 6.000E-05 | global batch size: 272 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 10090/ 159576 | consumed samples: 784912 | elapsed time per iteration (ms): 15060.9 | learning rate: 6.000E-05 | global batch size: 272 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-[2021-09-27 05:14:32] PULSE: tr8-104B is running for 1:20:16 since 2021-09-27T03:54:16 (1188168 on 'gpu_p13' partition (r6i5n[7-8],r6i6n0,r6i7n[7-8],r7i0n[0-5],r7i1n[7-8],r7i2n[0-1,5,8],r7i3n2,r7i5n7,r7i6n[1-4,8],r7i7n[0-4,6-8],r8i0n[0-8],r8i1n[0-4],r8i2n8,r8i3n[0-3,8],r8i4n[0-1],r8i6n[2-3,5-6],r8i7n[3-8],r9i0n[0-6,8],r9i1n[0-8],r9i2n[0,3-8],r9i3n[0-2,6-8],r9i4n[0-6,8],r9i5n[0-8],r9i6n[0-8],r9i7n[1-8])
- iteration 10100/ 159576 | consumed samples: 787632 | elapsed time per iteration (ms): 14624.0 | learning rate: 6.000E-05 | global batch size: 272 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 10110/ 159576 | consumed samples: 790352 | elapsed time per iteration (ms): 14621.7 | learning rate: 6.000E-05 | global batch size: 272 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 10120/ 159576 | consumed samples: 793072 | elapsed time per iteration (ms): 14685.1 | learning rate: 6.000E-05 | global batch size: 272 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 10130/ 159576 | consumed samples: 795792 | elapsed time per iteration (ms): 14531.8 | learning rate: 6.000E-05 | global batch size: 272 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 10140/ 159576 | consumed samples: 798512 | elapsed time per iteration (ms): 14629.6 | learning rate: 6.000E-05 | global batch size: 272 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
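For offline analysis these fixed-format iteration records can be scraped straight back out of the log. A small, hypothetical parser sketch (the field names are choices made here, not part of any shipped tooling):

    import re

    ITER_RE = re.compile(
        r"iteration\s+(\d+)/\s*\d+\s*\|"
        r"\s*consumed samples:\s*(\d+)\s*\|"
        r"\s*elapsed time per iteration \(ms\):\s*([\d.]+)\s*\|"
        r".*?global batch size:\s*(\d+)"
    )

    def parse_iterations(lines):
        # Yield one dict per "iteration N/ M | ..." record.
        for line in lines:
            m = ITER_RE.search(line)
            if m:
                yield {
                    "iteration": int(m.group(1)),
                    "consumed_samples": int(m.group(2)),
                    "step_ms": float(m.group(3)),
                    "global_batch_size": int(m.group(4)),
                }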
- iteration 10150/ 159576 | consumed samples: 801232 | elapsed time per iteration (ms): 14771.8 | learning rate: 6.000E-05 | global batch size: 272 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 10160/ 159576 | consumed samples: 803984 | elapsed time per iteration (ms): 14889.9 | learning rate: 6.000E-05 | global batch size: 288 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 10170/ 159576 | consumed samples: 806864 | elapsed time per iteration (ms): 15471.9 | learning rate: 6.000E-05 | global batch size: 288 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 10180/ 159576 | consumed samples: 809744 | elapsed time per iteration (ms): 15228.6 | learning rate: 6.000E-05 | global batch size: 288 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 10190/ 159576 | consumed samples: 812624 | elapsed time per iteration (ms): 15425.1 | learning rate: 6.000E-05 | global batch size: 288 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 10200/ 159576 | consumed samples: 815504 | elapsed time per iteration (ms): 15390.8 | learning rate: 6.000E-05 | global batch size: 288 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 10210/ 159576 | consumed samples: 818384 | elapsed time per iteration (ms): 15293.9 | learning rate: 6.000E-05 | global batch size: 288 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 10220/ 159576 | consumed samples: 821264 | elapsed time per iteration (ms): 15259.9 | learning rate: 6.000E-05 | global batch size: 288 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 10230/ 159576 | consumed samples: 824144 | elapsed time per iteration (ms): 15547.4 | learning rate: 6.000E-05 | global batch size: 288 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 10240/ 159576 | consumed samples: 827024 | elapsed time per iteration (ms): 15375.5 | learning rate: 6.000E-05 | global batch size: 288 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 10250/ 159576 | consumed samples: 829904 | elapsed time per iteration (ms): 15322.8 | learning rate: 6.000E-05 | global batch size: 288 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 10260/ 159576 | consumed samples: 832784 | elapsed time per iteration (ms): 15280.3 | learning rate: 6.000E-05 | global batch size: 288 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 10270/ 159576 | consumed samples: 835664 | elapsed time per iteration (ms): 15390.4 | learning rate: 6.000E-05 | global batch size: 288 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
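The global batch size stepping up by 16 (240 -> 256 -> 272 -> 288 above) is a linear batch-size ramp-up of the kind Megatron-LM's --rampup-batch-size <start> <increment> <ramp-samples> implements. A sketch of the mechanism; the constants below are reconstructed from the transition points visible in this excerpt (they reproduce all of them), not read from the actual launch script:

    def rampup_global_batch(consumed_samples: int,
                            start: int = 16,        # assumed starting batch size
                            increment: int = 16,    # matches the +16 steps seen here
                            ramp_samples: int = 6_000_000,  # assumed ramp length
                            final: int = 2048) -> int:      # assumed target batch
        # The ramp is split into equal consumed-sample intervals, one per bump.
        steps = (final - start) // increment
        samples_per_step = ramp_samples // steps
        bumps = min(consumed_samples // samples_per_step, steps)
        return start + bumps * increment

    for consumed in (701760, 708976, 757712, 803984, 853072):
        print(consumed, rampup_global_batch(consumed))
    # -> 240, 256, 272, 288, 304, matching the transitions in the log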
| elapsed time per iteration (ms): 15339.6 | learning rate: 6.000E-05 | global batch size: 288 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 10290/ 159576 | consumed samples: 841424 | elapsed time per iteration (ms): 15252.5 | learning rate: 6.000E-05 | global batch size: 288 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 10300/ 159576 | consumed samples: 844304 | elapsed time per iteration (ms): 15146.5 | learning rate: 6.000E-05 | global batch size: 288 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 10310/ 159576 | consumed samples: 847184 | elapsed time per iteration (ms): 15389.7 | learning rate: 6.000E-05 | global batch size: 288 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 10320/ 159576 | consumed samples: 850064 | elapsed time per iteration (ms): 15348.5 | learning rate: 6.000E-05 | global batch size: 288 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 10330/ 159576 | consumed samples: 853072 | elapsed time per iteration (ms): 15779.0 | learning rate: 6.000E-05 | global batch size: 304 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -[2021-09-27 06:14:35] PULSE: tr8-104B is running for 2:20:19 since 2021-09-27T03:54:16 (1188168 on 'gpu_p13' partition (r6i5n[7-8],r6i6n0,r6i7n[7-8],r7i0n[0-5],r7i1n[7-8],r7i2n[0-1,5,8],r7i3n2,r7i5n7,r7i6n[1-4,8],r7i7n[0-4,6-8],r8i0n[0-8],r8i1n[0-4],r8i2n8,r8i3n[0-3,8],r8i4n[0-1],r8i6n[2-3,5-6],r8i7n[3-8],r9i0n[0-6,8],r9i1n[0-8],r9i2n[0,3-8],r9i3n[0-2,6-8],r9i4n[0-6,8],r9i5n[0-8],r9i6n[0-8],r9i7n[1-8]) - iteration 10340/ 159576 | consumed samples: 856112 | elapsed time per iteration (ms): 15864.8 | learning rate: 6.000E-05 | global batch size: 304 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 10350/ 159576 | consumed samples: 859152 | elapsed time per iteration (ms): 15831.6 | learning rate: 6.000E-05 | global batch size: 304 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 10360/ 159576 | consumed samples: 862192 | elapsed time per iteration (ms): 15954.9 | learning rate: 6.000E-05 | global batch size: 304 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 10370/ 159576 | consumed samples: 865232 | elapsed time per iteration (ms): 15871.6 | learning rate: 6.000E-05 | global batch size: 304 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 10380/ 159576 | consumed samples: 868272 | elapsed time per iteration (ms): 15850.1 | learning rate: 6.000E-05 | global batch size: 304 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 10390/ 159576 | consumed samples: 871312 | elapsed time per iteration (ms): 15796.9 | learning rate: 6.000E-05 | global 
batch size: 304 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 10400/ 159576 | consumed samples: 874352 | elapsed time per iteration (ms): 16082.6 | learning rate: 6.000E-05 | global batch size: 304 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 10410/ 159576 | consumed samples: 877392 | elapsed time per iteration (ms): 16036.3 | learning rate: 6.000E-05 | global batch size: 304 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 10420/ 159576 | consumed samples: 880432 | elapsed time per iteration (ms): 15898.1 | learning rate: 6.000E-05 | global batch size: 304 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 10430/ 159576 | consumed samples: 883472 | elapsed time per iteration (ms): 15687.4 | learning rate: 6.000E-05 | global batch size: 304 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 10440/ 159576 | consumed samples: 886512 | elapsed time per iteration (ms): 15579.4 | learning rate: 6.000E-05 | global batch size: 304 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 10450/ 159576 | consumed samples: 889552 | elapsed time per iteration (ms): 16071.4 | learning rate: 6.000E-05 | global batch size: 304 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 10460/ 159576 | consumed samples: 892592 | elapsed time per iteration (ms): 15986.9 | learning rate: 6.000E-05 | global batch size: 304 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 10470/ 159576 | consumed samples: 895632 | elapsed time per iteration (ms): 15775.6 | learning rate: 6.000E-05 | global batch size: 304 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 10480/ 159576 | consumed samples: 898720 | elapsed time per iteration (ms): 16164.1 | learning rate: 6.000E-05 | global batch size: 320 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 10490/ 159576 | consumed samples: 901920 | elapsed time per iteration (ms): 16520.7 | learning rate: 6.000E-05 | global batch size: 320 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 10500/ 159576 | consumed samples: 905120 | elapsed time per iteration (ms): 16597.6 | learning rate: 6.000E-05 | global batch size: 320 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-saving checkpoint at iteration 10500 to /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints
-[2021-09-27 06:59:42,258] [INFO] [logging.py:68:log_dist] [Rank 0] Saving model checkpoint: /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/global_step10500/mp_rank_00_model_states.pt
- successfully saved checkpoint at iteration 10500 to /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints
-time (ms) | save-checkpoint: 21886.11
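The save-checkpoint timer just above is the blocking cost of writing the checkpoint (21886.11 ms at iteration 10500; the save at iteration 12000 later in this section takes 32585.61 ms). A minimal back-of-the-envelope sketch of the resulting overhead, not part of the original log: the function name is invented, and it assumes the save cadence visible in this section (saves at 10500 and 12000, i.e. every 1500 iterations) and the ~16.5 s/iteration reported around step 10500.

def checkpoint_overhead(save_ms, iters_between_saves, avg_iter_ms):
    # Fraction of wall time spent in the blocking save, under the stated assumptions.
    train_ms = iters_between_saves * avg_iter_ms
    return save_ms / (train_ms + save_ms)

# Values read off the surrounding log lines:
print(f"{checkpoint_overhead(21886.11, 1500, 16500.0):.4%}")  # ~0.0883%

At this cadence the save cost is negligible next to the per-iteration time.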
- iteration 10510/ 159576 | consumed samples: 908320 | elapsed time per iteration (ms): 18676.6 | learning rate: 6.000E-05 | global batch size: 320 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 10520/ 159576 | consumed samples: 911520 | elapsed time per iteration (ms): 16429.2 | learning rate: 6.000E-05 | global batch size: 320 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 10530/ 159576 | consumed samples: 914720 | elapsed time per iteration (ms): 16551.8 | learning rate: 6.000E-05 | global batch size: 320 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 10540/ 159576 | consumed samples: 917920 | elapsed time per iteration (ms): 16488.6 | learning rate: 6.000E-05 | global batch size: 320 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 10550/ 159576 | consumed samples: 921120 | elapsed time per iteration (ms): 16385.6 | learning rate: 6.000E-05 | global batch size: 320 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-[2021-09-27 07:14:45] PULSE: tr8-104B is running for 3:20:29 since 2021-09-27T03:54:16 (1188168 on 'gpu_p13' partition (r6i5n[7-8],r6i6n0,r6i7n[7-8],r7i0n[0-5],r7i1n[7-8],r7i2n[0-1,5,8],r7i3n2,r7i5n7,r7i6n[1-4,8],r7i7n[0-4,6-8],r8i0n[0-8],r8i1n[0-4],r8i2n8,r8i3n[0-3,8],r8i4n[0-1],r8i6n[2-3,5-6],r8i7n[3-8],r9i0n[0-6,8],r9i1n[0-8],r9i2n[0,3-8],r9i3n[0-2,6-8],r9i4n[0-6,8],r9i5n[0-8],r9i6n[0-8],r9i7n[1-8])
- iteration 10560/ 159576 | consumed samples: 924320 | elapsed time per iteration (ms): 16352.3 | learning rate: 6.000E-05 | global batch size: 320 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 10570/ 159576 | consumed samples: 927520 | elapsed time per iteration (ms): 16281.1 | learning rate: 6.000E-05 | global batch size: 320 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 10580/ 159576 | consumed samples: 930720 | elapsed time per iteration (ms): 16433.2 | learning rate: 6.000E-05 | global batch size: 320 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 10590/ 159576 | consumed samples: 933920 | elapsed time per iteration (ms): 16276.4 | learning rate: 6.000E-05 | global batch size: 320 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 10600/ 159576 | consumed samples: 937120 | elapsed time per iteration (ms): 16510.6 | learning rate: 6.000E-05 | global batch size: 320 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 10610/ 159576 | consumed samples: 940320 | elapsed time per iteration (ms): 16415.6 | learning rate: 6.000E-05 | global batch size: 320 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 |
number of nan iterations: 0 | -time (ms) - iteration 10620/ 159576 | consumed samples: 943520 | elapsed time per iteration (ms): 16211.4 | learning rate: 6.000E-05 | global batch size: 320 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 10630/ 159576 | consumed samples: 946800 | elapsed time per iteration (ms): 16664.6 | learning rate: 6.000E-05 | global batch size: 336 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 10640/ 159576 | consumed samples: 950160 | elapsed time per iteration (ms): 17041.3 | learning rate: 6.000E-05 | global batch size: 336 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 10650/ 159576 | consumed samples: 953520 | elapsed time per iteration (ms): 17363.3 | learning rate: 6.000E-05 | global batch size: 336 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 10660/ 159576 | consumed samples: 956880 | elapsed time per iteration (ms): 16944.5 | learning rate: 6.000E-05 | global batch size: 336 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 10670/ 159576 | consumed samples: 960240 | elapsed time per iteration (ms): 17142.6 | learning rate: 6.000E-05 | global batch size: 336 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 10680/ 159576 | consumed samples: 963600 | elapsed time per iteration (ms): 17139.9 | learning rate: 6.000E-05 | global batch size: 336 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 10690/ 159576 | consumed samples: 966960 | elapsed time per iteration (ms): 17104.6 | learning rate: 6.000E-05 | global batch size: 336 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 10700/ 159576 | consumed samples: 970320 | elapsed time per iteration (ms): 16968.9 | learning rate: 6.000E-05 | global batch size: 336 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 10710/ 159576 | consumed samples: 973680 | elapsed time per iteration (ms): 17071.1 | learning rate: 6.000E-05 | global batch size: 336 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 10720/ 159576 | consumed samples: 977040 | elapsed time per iteration (ms): 16939.7 | learning rate: 6.000E-05 | global batch size: 336 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 10730/ 159576 | consumed samples: 980400 | elapsed time per iteration (ms): 17182.0 | learning rate: 6.000E-05 | global batch size: 336 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 10740/ 159576 | consumed samples: 983760 | elapsed time per iteration (ms): 16947.4 | learning rate: 6.000E-05 | global batch size: 336 | loss scale: 1.0 
| grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 10750/ 159576 | consumed samples: 987120 | elapsed time per iteration (ms): 16887.4 | learning rate: 6.000E-05 | global batch size: 336 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 10760/ 159576 | consumed samples: 990480 | elapsed time per iteration (ms): 17060.2 | learning rate: 6.000E-05 | global batch size: 336 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -[2021-09-27 08:14:50] PULSE: tr8-104B is running for 4:20:34 since 2021-09-27T03:54:16 (1188168 on 'gpu_p13' partition (r6i5n[7-8],r6i6n0,r6i7n[7-8],r7i0n[0-5],r7i1n[7-8],r7i2n[0-1,5,8],r7i3n2,r7i5n7,r7i6n[1-4,8],r7i7n[0-4,6-8],r8i0n[0-8],r8i1n[0-4],r8i2n8,r8i3n[0-3,8],r8i4n[0-1],r8i6n[2-3,5-6],r8i7n[3-8],r9i0n[0-6,8],r9i1n[0-8],r9i2n[0,3-8],r9i3n[0-2,6-8],r9i4n[0-6,8],r9i5n[0-8],r9i6n[0-8],r9i7n[1-8]) - iteration 10770/ 159576 | consumed samples: 993920 | elapsed time per iteration (ms): 17207.0 | learning rate: 6.000E-05 | global batch size: 352 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 10780/ 159576 | consumed samples: 997440 | elapsed time per iteration (ms): 17439.0 | learning rate: 6.000E-05 | global batch size: 352 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 10790/ 159576 | consumed samples: 1000960 | elapsed time per iteration (ms): 17709.5 | learning rate: 6.000E-05 | global batch size: 352 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 10800/ 159576 | consumed samples: 1004480 | elapsed time per iteration (ms): 17397.4 | learning rate: 6.000E-05 | global batch size: 352 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 10810/ 159576 | consumed samples: 1008000 | elapsed time per iteration (ms): 17515.8 | learning rate: 6.000E-05 | global batch size: 352 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 10820/ 159576 | consumed samples: 1011520 | elapsed time per iteration (ms): 17500.0 | learning rate: 6.000E-05 | global batch size: 352 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 10830/ 159576 | consumed samples: 1015040 | elapsed time per iteration (ms): 17623.4 | learning rate: 6.000E-05 | global batch size: 352 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 10840/ 159576 | consumed samples: 1018560 | elapsed time per iteration (ms): 17764.6 | learning rate: 6.000E-05 | global batch size: 352 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 10850/ 159576 | consumed samples: 1022080 | elapsed time per iteration (ms): 17667.0 | learning rate: 6.000E-05 | global batch size: 352 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | 
number of nan iterations: 0 | -time (ms) - iteration 10860/ 159576 | consumed samples: 1025600 | elapsed time per iteration (ms): 17590.6 | learning rate: 6.000E-05 | global batch size: 352 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 10870/ 159576 | consumed samples: 1029120 | elapsed time per iteration (ms): 17626.8 | learning rate: 6.000E-05 | global batch size: 352 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 10880/ 159576 | consumed samples: 1032640 | elapsed time per iteration (ms): 17668.3 | learning rate: 6.000E-05 | global batch size: 352 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 10890/ 159576 | consumed samples: 1036160 | elapsed time per iteration (ms): 17624.1 | learning rate: 6.000E-05 | global batch size: 352 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 10900/ 159576 | consumed samples: 1039680 | elapsed time per iteration (ms): 17793.8 | learning rate: 6.000E-05 | global batch size: 352 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 10910/ 159576 | consumed samples: 1043360 | elapsed time per iteration (ms): 18188.2 | learning rate: 6.000E-05 | global batch size: 368 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 10920/ 159576 | consumed samples: 1047040 | elapsed time per iteration (ms): 18317.3 | learning rate: 6.000E-05 | global batch size: 368 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 10930/ 159576 | consumed samples: 1050720 | elapsed time per iteration (ms): 18324.8 | learning rate: 6.000E-05 | global batch size: 368 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 10940/ 159576 | consumed samples: 1054400 | elapsed time per iteration (ms): 18321.8 | learning rate: 6.000E-05 | global batch size: 368 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 10950/ 159576 | consumed samples: 1058080 | elapsed time per iteration (ms): 18321.0 | learning rate: 6.000E-05 | global batch size: 368 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 10960/ 159576 | consumed samples: 1061760 | elapsed time per iteration (ms): 18223.5 | learning rate: 6.000E-05 | global batch size: 368 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -[2021-09-27 09:14:51] PULSE: tr8-104B is running for 5:20:35 since 2021-09-27T03:54:16 (1188168 on 'gpu_p13' partition (r6i5n[7-8],r6i6n0,r6i7n[7-8],r7i0n[0-5],r7i1n[7-8],r7i2n[0-1,5,8],r7i3n2,r7i5n7,r7i6n[1-4,8],r7i7n[0-4,6-8],r8i0n[0-8],r8i1n[0-4],r8i2n8,r8i3n[0-3,8],r8i4n[0-1],r8i6n[2-3,5-6],r8i7n[3-8],r9i0n[0-6,8],r9i1n[0-8],r9i2n[0,3-8],r9i3n[0-2,6-8],r9i4n[0-6,8],r9i5n[0-8],r9i6n[0-8],r9i7n[1-8]) - iteration 10970/ 159576 
| consumed samples: 1065440 | elapsed time per iteration (ms): 18268.5 | learning rate: 6.000E-05 | global batch size: 368 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 10980/ 159576 | consumed samples: 1069120 | elapsed time per iteration (ms): 18399.6 | learning rate: 6.000E-05 | global batch size: 368 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 10990/ 159576 | consumed samples: 1072800 | elapsed time per iteration (ms): 18217.5 | learning rate: 6.000E-05 | global batch size: 368 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 11000/ 159576 | consumed samples: 1076480 | elapsed time per iteration (ms): 18260.1 | learning rate: 6.000E-05 | global batch size: 368 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
--------------------------------------------------------------------------------------------------
- validation loss at iteration 11000 | lm loss value: 7.284734E+00 | lm loss PPL: 1.457873E+03 |
--------------------------------------------------------------------------------------------------
- iteration 11010/ 159576 | consumed samples: 1080160 | elapsed time per iteration (ms): 20666.6 | learning rate: 6.000E-05 | global batch size: 368 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 11020/ 159576 | consumed samples: 1083840 | elapsed time per iteration (ms): 18277.2 | learning rate: 6.000E-05 | global batch size: 368 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 11030/ 159576 | consumed samples: 1087552 | elapsed time per iteration (ms): 18419.3 | learning rate: 6.000E-05 | global batch size: 384 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 11040/ 159576 | consumed samples: 1091392 | elapsed time per iteration (ms): 19002.0 | learning rate: 6.000E-05 | global batch size: 384 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 11050/ 159576 | consumed samples: 1095232 | elapsed time per iteration (ms): 18930.9 | learning rate: 6.000E-05 | global batch size: 384 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 11060/ 159576 | consumed samples: 1099072 | elapsed time per iteration (ms): 18821.2 | learning rate: 6.000E-05 | global batch size: 384 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 11070/ 159576 | consumed samples: 1102912 | elapsed time per iteration (ms): 18889.6 | learning rate: 6.000E-05 | global batch size: 384 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 11080/ 159576 | consumed samples: 1106752 | elapsed time per iteration (ms): 18970.4 | learning rate: 6.000E-05 | global batch size: 384 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 |
number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 11090/ 159576 | consumed samples: 1110592 | elapsed time per iteration (ms): 18822.6 | learning rate: 6.000E-05 | global batch size: 384 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 11100/ 159576 | consumed samples: 1114432 | elapsed time per iteration (ms): 18697.2 | learning rate: 6.000E-05 | global batch size: 384 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 11110/ 159576 | consumed samples: 1118272 | elapsed time per iteration (ms): 18737.4 | learning rate: 6.000E-05 | global batch size: 384 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 11120/ 159576 | consumed samples: 1122112 | elapsed time per iteration (ms): 18949.1 | learning rate: 6.000E-05 | global batch size: 384 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 11130/ 159576 | consumed samples: 1125952 | elapsed time per iteration (ms): 19003.8 | learning rate: 6.000E-05 | global batch size: 384 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 11140/ 159576 | consumed samples: 1129792 | elapsed time per iteration (ms): 18836.8 | learning rate: 6.000E-05 | global batch size: 384 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 11150/ 159576 | consumed samples: 1133632 | elapsed time per iteration (ms): 18941.7 | learning rate: 6.000E-05 | global batch size: 384 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 11160/ 159576 | consumed samples: 1137616 | elapsed time per iteration (ms): 19465.1 | learning rate: 6.000E-05 | global batch size: 400 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -[2021-09-27 10:14:56] PULSE: tr8-104B is running for 6:20:40 since 2021-09-27T03:54:16 (1188168 on 'gpu_p13' partition (r6i5n[7-8],r6i6n0,r6i7n[7-8],r7i0n[0-5],r7i1n[7-8],r7i2n[0-1,5,8],r7i3n2,r7i5n7,r7i6n[1-4,8],r7i7n[0-4,6-8],r8i0n[0-8],r8i1n[0-4],r8i2n8,r8i3n[0-3,8],r8i4n[0-1],r8i6n[2-3,5-6],r8i7n[3-8],r9i0n[0-6,8],r9i1n[0-8],r9i2n[0,3-8],r9i3n[0-2,6-8],r9i4n[0-6,8],r9i5n[0-8],r9i6n[0-8],r9i7n[1-8]) - iteration 11170/ 159576 | consumed samples: 1141616 | elapsed time per iteration (ms): 19493.8 | learning rate: 6.000E-05 | global batch size: 400 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 11180/ 159576 | consumed samples: 1145616 | elapsed time per iteration (ms): 19504.7 | learning rate: 6.000E-05 | global batch size: 400 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 11190/ 159576 | consumed samples: 1149616 | elapsed time per iteration (ms): 19555.2 | learning rate: 6.000E-05 | global batch size: 400 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | 
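The " iteration N/ 159576 | ..." records repeated throughout are fixed-format progress lines, so the run's health metrics can be recovered from the raw text: achieved throughput is global batch size divided by the per-iteration time, and the global batch size itself climbs in +16 steps roughly every ~48k consumed samples in this section (288 at iteration 10160 up to 576 by iteration 12250), consistent with a linear batch-size ramp-up schedule in the --rampup-batch-size style. A minimal parsing sketch, not part of the original log (the regex mirrors the line format above; the names are invented):

import re

# Matches the Megatron-style progress records reproduced above.
PAT = re.compile(
    r"iteration\s+(\d+)/\s*\d+ \| consumed samples: (\d+) \| "
    r"elapsed time per iteration \(ms\): ([\d.]+) \| .*?"
    r"global batch size: (\d+)"
)

def scan(lines):
    prev_bs = None
    for line in lines:
        for m in PAT.finditer(line):  # collapsed lines may hold many records
            it, samples = int(m[1]), int(m[2])
            ms, bs = float(m[3]), int(m[4])
            sps = bs / (ms / 1000.0)  # achieved samples/sec at this batch size
            if prev_bs is not None and bs != prev_bs:
                print(f"iter {it}: batch size ramped {prev_bs} -> {bs}")
            prev_bs = bs
            print(f"iter {it}: {sps:.1f} samples/s, {samples} samples consumed")

# e.g. the iteration 11190 record just above: 400 / 19.5552 s ~= 20.5 samples/s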
-time (ms) - iteration 11200/ 159576 | consumed samples: 1153616 | elapsed time per iteration (ms): 19490.6 | learning rate: 6.000E-05 | global batch size: 400 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 11210/ 159576 | consumed samples: 1157616 | elapsed time per iteration (ms): 19532.7 | learning rate: 6.000E-05 | global batch size: 400 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 11220/ 159576 | consumed samples: 1161616 | elapsed time per iteration (ms): 19261.8 | learning rate: 6.000E-05 | global batch size: 400 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 11230/ 159576 | consumed samples: 1165616 | elapsed time per iteration (ms): 19376.4 | learning rate: 6.000E-05 | global batch size: 400 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 11240/ 159576 | consumed samples: 1169616 | elapsed time per iteration (ms): 19505.2 | learning rate: 6.000E-05 | global batch size: 400 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 11250/ 159576 | consumed samples: 1173616 | elapsed time per iteration (ms): 19535.4 | learning rate: 6.000E-05 | global batch size: 400 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 11260/ 159576 | consumed samples: 1177616 | elapsed time per iteration (ms): 19415.2 | learning rate: 6.000E-05 | global batch size: 400 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 11270/ 159576 | consumed samples: 1181632 | elapsed time per iteration (ms): 19446.5 | learning rate: 6.000E-05 | global batch size: 416 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 11280/ 159576 | consumed samples: 1185792 | elapsed time per iteration (ms): 20068.3 | learning rate: 6.000E-05 | global batch size: 416 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 11290/ 159576 | consumed samples: 1189952 | elapsed time per iteration (ms): 19947.1 | learning rate: 6.000E-05 | global batch size: 416 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 11300/ 159576 | consumed samples: 1194112 | elapsed time per iteration (ms): 20002.0 | learning rate: 6.000E-05 | global batch size: 416 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 11310/ 159576 | consumed samples: 1198272 | elapsed time per iteration (ms): 20006.4 | learning rate: 6.000E-05 | global batch size: 416 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 11320/ 159576 | consumed samples: 1202432 | elapsed time per iteration (ms): 20000.1 | learning rate: 6.000E-05 | global batch size: 416 | loss scale: 1.0 | grad norm: 
0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 11330/ 159576 | consumed samples: 1206592 | elapsed time per iteration (ms): 20065.5 | learning rate: 6.000E-05 | global batch size: 416 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 11340/ 159576 | consumed samples: 1210752 | elapsed time per iteration (ms): 19952.9 | learning rate: 6.000E-05 | global batch size: 416 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -[2021-09-27 11:15:05] PULSE: tr8-104B is running for 7:20:49 since 2021-09-27T03:54:16 (1188168 on 'gpu_p13' partition (r6i5n[7-8],r6i6n0,r6i7n[7-8],r7i0n[0-5],r7i1n[7-8],r7i2n[0-1,5,8],r7i3n2,r7i5n7,r7i6n[1-4,8],r7i7n[0-4,6-8],r8i0n[0-8],r8i1n[0-4],r8i2n8,r8i3n[0-3,8],r8i4n[0-1],r8i6n[2-3,5-6],r8i7n[3-8],r9i0n[0-6,8],r9i1n[0-8],r9i2n[0,3-8],r9i3n[0-2,6-8],r9i4n[0-6,8],r9i5n[0-8],r9i6n[0-8],r9i7n[1-8]) - iteration 11350/ 159576 | consumed samples: 1214912 | elapsed time per iteration (ms): 19989.1 | learning rate: 6.000E-05 | global batch size: 416 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 11360/ 159576 | consumed samples: 1219072 | elapsed time per iteration (ms): 19868.7 | learning rate: 6.000E-05 | global batch size: 416 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 11370/ 159576 | consumed samples: 1223232 | elapsed time per iteration (ms): 19987.6 | learning rate: 6.000E-05 | global batch size: 416 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 11380/ 159576 | consumed samples: 1227392 | elapsed time per iteration (ms): 19947.5 | learning rate: 6.000E-05 | global batch size: 416 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 11390/ 159576 | consumed samples: 1231664 | elapsed time per iteration (ms): 20206.1 | learning rate: 6.000E-05 | global batch size: 432 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 11400/ 159576 | consumed samples: 1235984 | elapsed time per iteration (ms): 20686.4 | learning rate: 6.000E-05 | global batch size: 432 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 11410/ 159576 | consumed samples: 1240304 | elapsed time per iteration (ms): 20763.5 | learning rate: 6.000E-05 | global batch size: 432 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 11420/ 159576 | consumed samples: 1244624 | elapsed time per iteration (ms): 20718.0 | learning rate: 6.000E-05 | global batch size: 432 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 11430/ 159576 | consumed samples: 1248944 | elapsed time per iteration (ms): 20629.3 | learning rate: 6.000E-05 | global batch size: 432 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number 
of nan iterations: 0 | -time (ms) - iteration 11440/ 159576 | consumed samples: 1253264 | elapsed time per iteration (ms): 20735.7 | learning rate: 6.000E-05 | global batch size: 432 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 11450/ 159576 | consumed samples: 1257584 | elapsed time per iteration (ms): 20551.6 | learning rate: 6.000E-05 | global batch size: 432 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 11460/ 159576 | consumed samples: 1261904 | elapsed time per iteration (ms): 20425.6 | learning rate: 6.000E-05 | global batch size: 432 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 11470/ 159576 | consumed samples: 1266224 | elapsed time per iteration (ms): 20522.3 | learning rate: 6.000E-05 | global batch size: 432 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 11480/ 159576 | consumed samples: 1270544 | elapsed time per iteration (ms): 20523.5 | learning rate: 6.000E-05 | global batch size: 432 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 11490/ 159576 | consumed samples: 1274864 | elapsed time per iteration (ms): 20644.7 | learning rate: 6.000E-05 | global batch size: 432 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 11500/ 159576 | consumed samples: 1279312 | elapsed time per iteration (ms): 21082.2 | learning rate: 6.000E-05 | global batch size: 448 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 11510/ 159576 | consumed samples: 1283792 | elapsed time per iteration (ms): 21312.4 | learning rate: 6.000E-05 | global batch size: 448 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 11520/ 159576 | consumed samples: 1288272 | elapsed time per iteration (ms): 21403.7 | learning rate: 6.000E-05 | global batch size: 448 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 11530/ 159576 | consumed samples: 1292752 | elapsed time per iteration (ms): 21133.4 | learning rate: 6.000E-05 | global batch size: 448 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 11540/ 159576 | consumed samples: 1297232 | elapsed time per iteration (ms): 21166.4 | learning rate: 6.000E-05 | global batch size: 448 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 11550/ 159576 | consumed samples: 1301712 | elapsed time per iteration (ms): 21259.6 | learning rate: 6.000E-05 | global batch size: 448 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -[2021-09-27 12:27:56] PULSE: tr8-104B is running for 8:33:40 since 2021-09-27T03:54:16 (1188168 on 'gpu_p13' partition 
(r6i5n[7-8],r6i6n0,r6i7n[7-8],r7i0n[0-5],r7i1n[7-8],r7i2n[0-1,5,8],r7i3n2,r7i5n7,r7i6n[1-4,8],r7i7n[0-4,6-8],r8i0n[0-8],r8i1n[0-4],r8i2n8,r8i3n[0-3,8],r8i4n[0-1],r8i6n[2-3,5-6],r8i7n[3-8],r9i0n[0-6,8],r9i1n[0-8],r9i2n[0,3-8],r9i3n[0-2,6-8],r9i4n[0-6,8],r9i5n[0-8],r9i6n[0-8],r9i7n[1-8]) - iteration 11560/ 159576 | consumed samples: 1306192 | elapsed time per iteration (ms): 21050.1 | learning rate: 6.000E-05 | global batch size: 448 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 11570/ 159576 | consumed samples: 1310672 | elapsed time per iteration (ms): 21058.2 | learning rate: 6.000E-05 | global batch size: 448 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 11580/ 159576 | consumed samples: 1315152 | elapsed time per iteration (ms): 21057.7 | learning rate: 6.000E-05 | global batch size: 448 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 11590/ 159576 | consumed samples: 1319632 | elapsed time per iteration (ms): 21281.4 | learning rate: 6.000E-05 | global batch size: 448 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 11600/ 159576 | consumed samples: 1324144 | elapsed time per iteration (ms): 21318.5 | learning rate: 6.000E-05 | global batch size: 464 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 11610/ 159576 | consumed samples: 1328784 | elapsed time per iteration (ms): 21769.2 | learning rate: 6.000E-05 | global batch size: 464 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 11620/ 159576 | consumed samples: 1333424 | elapsed time per iteration (ms): 21656.2 | learning rate: 6.000E-05 | global batch size: 464 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 11630/ 159576 | consumed samples: 1338064 | elapsed time per iteration (ms): 21947.9 | learning rate: 6.000E-05 | global batch size: 464 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 11640/ 159576 | consumed samples: 1342704 | elapsed time per iteration (ms): 21602.8 | learning rate: 6.000E-05 | global batch size: 464 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 11650/ 159576 | consumed samples: 1347344 | elapsed time per iteration (ms): 21770.3 | learning rate: 6.000E-05 | global batch size: 464 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 11660/ 159576 | consumed samples: 1351984 | elapsed time per iteration (ms): 21697.2 | learning rate: 6.000E-05 | global batch size: 464 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 11670/ 159576 | consumed samples: 1356624 | elapsed time per iteration (ms): 22004.7 | learning rate: 6.000E-05 | global batch size: 464 | loss scale: 1.0 | grad norm: 0.000 | 
num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 11680/ 159576 | consumed samples: 1361264 | elapsed time per iteration (ms): 21654.6 | learning rate: 6.000E-05 | global batch size: 464 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 11690/ 159576 | consumed samples: 1365904 | elapsed time per iteration (ms): 21840.4 | learning rate: 6.000E-05 | global batch size: 464 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 11700/ 159576 | consumed samples: 1370560 | elapsed time per iteration (ms): 21982.9 | learning rate: 6.000E-05 | global batch size: 480 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 11710/ 159576 | consumed samples: 1375360 | elapsed time per iteration (ms): 22227.6 | learning rate: 6.000E-05 | global batch size: 480 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 11720/ 159576 | consumed samples: 1380160 | elapsed time per iteration (ms): 22533.1 | learning rate: 6.000E-05 | global batch size: 480 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -[2021-09-27 13:27:56] PULSE: tr8-104B is running for 9:33:40 since 2021-09-27T03:54:16 (1188168 on 'gpu_p13' partition (r6i5n[7-8],r6i6n0,r6i7n[7-8],r7i0n[0-5],r7i1n[7-8],r7i2n[0-1,5,8],r7i3n2,r7i5n7,r7i6n[1-4,8],r7i7n[0-4,6-8],r8i0n[0-8],r8i1n[0-4],r8i2n8,r8i3n[0-3,8],r8i4n[0-1],r8i6n[2-3,5-6],r8i7n[3-8],r9i0n[0-6,8],r9i1n[0-8],r9i2n[0,3-8],r9i3n[0-2,6-8],r9i4n[0-6,8],r9i5n[0-8],r9i6n[0-8],r9i7n[1-8]) - iteration 11730/ 159576 | consumed samples: 1384960 | elapsed time per iteration (ms): 22192.1 | learning rate: 6.000E-05 | global batch size: 480 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 11740/ 159576 | consumed samples: 1389760 | elapsed time per iteration (ms): 22268.7 | learning rate: 6.000E-05 | global batch size: 480 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 11750/ 159576 | consumed samples: 1394560 | elapsed time per iteration (ms): 22268.4 | learning rate: 6.000E-05 | global batch size: 480 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 11760/ 159576 | consumed samples: 1399360 | elapsed time per iteration (ms): 22141.9 | learning rate: 6.000E-05 | global batch size: 480 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 11770/ 159576 | consumed samples: 1404160 | elapsed time per iteration (ms): 21979.0 | learning rate: 6.000E-05 | global batch size: 480 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 11780/ 159576 | consumed samples: 1408960 | elapsed time per iteration (ms): 22172.2 | learning rate: 6.000E-05 | global batch size: 480 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan 
iterations: 0 | -time (ms) - iteration 11790/ 159576 | consumed samples: 1413760 | elapsed time per iteration (ms): 22335.9 | learning rate: 6.000E-05 | global batch size: 480 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 11800/ 159576 | consumed samples: 1418592 | elapsed time per iteration (ms): 22588.3 | learning rate: 6.000E-05 | global batch size: 496 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 11810/ 159576 | consumed samples: 1423552 | elapsed time per iteration (ms): 22823.4 | learning rate: 6.000E-05 | global batch size: 496 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 11820/ 159576 | consumed samples: 1428512 | elapsed time per iteration (ms): 22959.2 | learning rate: 6.000E-05 | global batch size: 496 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 11830/ 159576 | consumed samples: 1433472 | elapsed time per iteration (ms): 23080.3 | learning rate: 6.000E-05 | global batch size: 496 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 11840/ 159576 | consumed samples: 1438432 | elapsed time per iteration (ms): 23034.0 | learning rate: 6.000E-05 | global batch size: 496 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 11850/ 159576 | consumed samples: 1443392 | elapsed time per iteration (ms): 23099.6 | learning rate: 6.000E-05 | global batch size: 496 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 11860/ 159576 | consumed samples: 1448352 | elapsed time per iteration (ms): 23031.2 | learning rate: 6.000E-05 | global batch size: 496 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 11870/ 159576 | consumed samples: 1453312 | elapsed time per iteration (ms): 22866.8 | learning rate: 6.000E-05 | global batch size: 496 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 11880/ 159576 | consumed samples: 1458272 | elapsed time per iteration (ms): 23007.5 | learning rate: 6.000E-05 | global batch size: 496 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -[2021-09-27 14:27:59] PULSE: tr8-104B is running for 10:33:43 since 2021-09-27T03:54:16 (1188168 on 'gpu_p13' partition (r6i5n[7-8],r6i6n0,r6i7n[7-8],r7i0n[0-5],r7i1n[7-8],r7i2n[0-1,5,8],r7i3n2,r7i5n7,r7i6n[1-4,8],r7i7n[0-4,6-8],r8i0n[0-8],r8i1n[0-4],r8i2n8,r8i3n[0-3,8],r8i4n[0-1],r8i6n[2-3,5-6],r8i7n[3-8],r9i0n[0-6,8],r9i1n[0-8],r9i2n[0,3-8],r9i3n[0-2,6-8],r9i4n[0-6,8],r9i5n[0-8],r9i6n[0-8],r9i7n[1-8]) - iteration 11890/ 159576 | consumed samples: 1463232 | elapsed time per iteration (ms): 23034.3 | learning rate: 6.000E-05 | global batch size: 496 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 11900/ 159576 | consumed 
samples: 1468304 | elapsed time per iteration (ms): 23486.5 | learning rate: 6.000E-05 | global batch size: 512 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 11910/ 159576 | consumed samples: 1473424 | elapsed time per iteration (ms): 23540.7 | learning rate: 6.000E-05 | global batch size: 512 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 11920/ 159576 | consumed samples: 1478544 | elapsed time per iteration (ms): 23676.0 | learning rate: 6.000E-05 | global batch size: 512 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 11930/ 159576 | consumed samples: 1483664 | elapsed time per iteration (ms): 23529.7 | learning rate: 6.000E-05 | global batch size: 512 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 11940/ 159576 | consumed samples: 1488784 | elapsed time per iteration (ms): 23604.1 | learning rate: 6.000E-05 | global batch size: 512 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 11950/ 159576 | consumed samples: 1493904 | elapsed time per iteration (ms): 23627.0 | learning rate: 6.000E-05 | global batch size: 512 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 11960/ 159576 | consumed samples: 1499024 | elapsed time per iteration (ms): 23559.5 | learning rate: 6.000E-05 | global batch size: 512 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 11970/ 159576 | consumed samples: 1504144 | elapsed time per iteration (ms): 23611.0 | learning rate: 6.000E-05 | global batch size: 512 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 11980/ 159576 | consumed samples: 1509264 | elapsed time per iteration (ms): 23634.8 | learning rate: 6.000E-05 | global batch size: 512 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 11990/ 159576 | consumed samples: 1514464 | elapsed time per iteration (ms): 23596.0 | learning rate: 6.000E-05 | global batch size: 528 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-[2021-09-27 15:14:45,510] [INFO] [logging.py:68:log_dist] [Rank 0] step=12000, skipped=3052, lr=[5.999919375575235e-05, 5.999919375575235e-05], mom=[(0.9, 0.999), (0.9, 0.999)]
-steps: 12000 loss: nan iter time (s): 0.012 samples/sec: 43274.454
- iteration 12000/ 159576 | consumed samples: 1519744 | elapsed time per iteration (ms): 24091.4 | learning rate: 6.000E-05 | global batch size: 528 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
--------------------------------------------------------------------------------------------------
- validation loss at iteration 12000 | lm loss value: 7.282808E+00 | lm loss PPL: 1.455068E+03 |
--------------------------------------------------------------------------------------------------
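The validation banner prints perplexity as the exponential of the LM loss, so the two printed columns can be cross-checked directly; note also that the DeepSpeed step line above reports loss: nan at step 12000 while the Megatron record on the same step shows grad norm 0.000 and loss scale 1.0. A one-off verification of the loss/PPL pairing, illustrative and not part of the original log:

import math

# PPL = exp(lm loss); check both validation points in this section:
for it, lm_loss in ((11000, 7.284734), (12000, 7.282808)):
    print(it, math.exp(lm_loss))  # 11000 -> ~1457.87, 12000 -> ~1455.07
# Matches the printed "lm loss PPL" values (1.457873E+03 and 1.455068E+03).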
-saving checkpoint at iteration 12000 to /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints
-[2021-09-27 15:15:22,225] [INFO] [logging.py:68:log_dist] [Rank 0] Saving model checkpoint: /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/global_step12000/mp_rank_00_model_states.pt
- successfully saved checkpoint at iteration 12000 to /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints
-time (ms) | save-checkpoint: 32585.61
- iteration 12010/ 159576 | consumed samples: 1525024 | elapsed time per iteration (ms): 30246.8 | learning rate: 6.000E-05 | global batch size: 528 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 12020/ 159576 | consumed samples: 1530304 | elapsed time per iteration (ms): 24139.3 | learning rate: 6.000E-05 | global batch size: 528 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 12030/ 159576 | consumed samples: 1535584 | elapsed time per iteration (ms): 24280.0 | learning rate: 6.000E-05 | global batch size: 528 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-[2021-09-27 15:28:02] PULSE: tr8-104B is running for 11:33:46 since 2021-09-27T03:54:16 (1188168 on 'gpu_p13' partition (r6i5n[7-8],r6i6n0,r6i7n[7-8],r7i0n[0-5],r7i1n[7-8],r7i2n[0-1,5,8],r7i3n2,r7i5n7,r7i6n[1-4,8],r7i7n[0-4,6-8],r8i0n[0-8],r8i1n[0-4],r8i2n8,r8i3n[0-3,8],r8i4n[0-1],r8i6n[2-3,5-6],r8i7n[3-8],r9i0n[0-6,8],r9i1n[0-8],r9i2n[0,3-8],r9i3n[0-2,6-8],r9i4n[0-6,8],r9i5n[0-8],r9i6n[0-8],r9i7n[1-8])
- iteration 12040/ 159576 | consumed samples: 1540864 | elapsed time per iteration (ms): 23963.9 | learning rate: 6.000E-05 | global batch size: 528 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 12050/ 159576 | consumed samples: 1546144 | elapsed time per iteration (ms): 24135.8 | learning rate: 6.000E-05 | global batch size: 528 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 12060/ 159576 | consumed samples: 1551424 | elapsed time per iteration (ms): 24044.3 | learning rate: 6.000E-05 | global batch size: 528 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 12070/ 159576 | consumed samples: 1556704 | elapsed time per iteration (ms): 24087.4 | learning rate: 6.000E-05 | global batch size: 528 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 12080/ 159576 | consumed samples: 1562064 | elapsed time per iteration (ms): 24400.0 | learning rate: 6.000E-05 | global batch size: 544 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 12090/ 159576 | consumed samples: 1567504 | elapsed time per iteration (ms): 24552.7 | learning rate: 6.000E-05 | global batch size: 544 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 12100/ 159576 | consumed samples: 1572944 | elapsed time
per iteration (ms): 24886.7 | learning rate: 6.000E-05 | global batch size: 544 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 12110/ 159576 | consumed samples: 1578384 | elapsed time per iteration (ms): 24781.4 | learning rate: 6.000E-05 | global batch size: 544 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 12120/ 159576 | consumed samples: 1583824 | elapsed time per iteration (ms): 24493.1 | learning rate: 6.000E-05 | global batch size: 544 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 12130/ 159576 | consumed samples: 1589264 | elapsed time per iteration (ms): 24851.3 | learning rate: 6.000E-05 | global batch size: 544 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 12140/ 159576 | consumed samples: 1594704 | elapsed time per iteration (ms): 24746.4 | learning rate: 6.000E-05 | global batch size: 544 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 12150/ 159576 | consumed samples: 1600144 | elapsed time per iteration (ms): 24578.3 | learning rate: 6.000E-05 | global batch size: 544 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 12160/ 159576 | consumed samples: 1605584 | elapsed time per iteration (ms): 24469.2 | learning rate: 6.000E-05 | global batch size: 544 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 12170/ 159576 | consumed samples: 1611152 | elapsed time per iteration (ms): 24994.1 | learning rate: 6.000E-05 | global batch size: 560 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -[2021-09-27 16:28:40] PULSE: tr8-104B is running for 12:34:24 since 2021-09-27T03:54:16 (1188168 on 'gpu_p13' partition (r6i5n[7-8],r6i6n0,r6i7n[7-8],r7i0n[0-5],r7i1n[7-8],r7i2n[0-1,5,8],r7i3n2,r7i5n7,r7i6n[1-4,8],r7i7n[0-4,6-8],r8i0n[0-8],r8i1n[0-4],r8i2n8,r8i3n[0-3,8],r8i4n[0-1],r8i6n[2-3,5-6],r8i7n[3-8],r9i0n[0-6,8],r9i1n[0-8],r9i2n[0,3-8],r9i3n[0-2,6-8],r9i4n[0-6,8],r9i5n[0-8],r9i6n[0-8],r9i7n[1-8]) - iteration 12180/ 159576 | consumed samples: 1616752 | elapsed time per iteration (ms): 25275.1 | learning rate: 6.000E-05 | global batch size: 560 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 12190/ 159576 | consumed samples: 1622352 | elapsed time per iteration (ms): 25176.8 | learning rate: 6.000E-05 | global batch size: 560 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 12200/ 159576 | consumed samples: 1627952 | elapsed time per iteration (ms): 25167.8 | learning rate: 6.000E-05 | global batch size: 560 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 12210/ 159576 | consumed samples: 1633552 | elapsed time per iteration (ms): 25057.7 | learning rate: 6.000E-05 | global 
batch size: 560 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 12220/ 159576 | consumed samples: 1639152 | elapsed time per iteration (ms): 25147.4 | learning rate: 6.000E-05 | global batch size: 560 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 12230/ 159576 | consumed samples: 1644752 | elapsed time per iteration (ms): 25198.7 | learning rate: 6.000E-05 | global batch size: 560 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 12240/ 159576 | consumed samples: 1650352 | elapsed time per iteration (ms): 24894.2 | learning rate: 6.000E-05 | global batch size: 560 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 12250/ 159576 | consumed samples: 1656016 | elapsed time per iteration (ms): 25306.4 | learning rate: 6.000E-05 | global batch size: 576 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 12260/ 159576 | consumed samples: 1661776 | elapsed time per iteration (ms): 25946.7 | learning rate: 6.000E-05 | global batch size: 576 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 12270/ 159576 | consumed samples: 1667536 | elapsed time per iteration (ms): 25714.3 | learning rate: 6.000E-05 | global batch size: 576 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 12280/ 159576 | consumed samples: 1673296 | elapsed time per iteration (ms): 25863.6 | learning rate: 6.000E-05 | global batch size: 576 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 12290/ 159576 | consumed samples: 1679056 | elapsed time per iteration (ms): 26038.1 | learning rate: 6.000E-05 | global batch size: 576 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 12300/ 159576 | consumed samples: 1684816 | elapsed time per iteration (ms): 25611.4 | learning rate: 6.000E-05 | global batch size: 576 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 12310/ 159576 | consumed samples: 1690576 | elapsed time per iteration (ms): 25819.3 | learning rate: 6.000E-05 | global batch size: 576 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -[2021-09-27 17:28:18] PULSE: tr8-104B is running for 13:34:02 since 2021-09-27T03:54:16 (1188168 on 'gpu_p13' partition (r6i5n[7-8],r6i6n0,r6i7n[7-8],r7i0n[0-5],r7i1n[7-8],r7i2n[0-1,5,8],r7i3n2,r7i5n7,r7i6n[1-4,8],r7i7n[0-4,6-8],r8i0n[0-8],r8i1n[0-4],r8i2n8,r8i3n[0-3,8],r8i4n[0-1],r8i6n[2-3,5-6],r8i7n[3-8],r9i0n[0-6,8],r9i1n[0-8],r9i2n[0,3-8],r9i3n[0-2,6-8],r9i4n[0-6,8],r9i5n[0-8],r9i6n[0-8],r9i7n[1-8]) - iteration 12320/ 159576 | consumed samples: 1696336 | elapsed time per iteration (ms): 25983.5 | learning rate: 6.000E-05 | global batch size: 576 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 
0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 12330/ 159576 | consumed samples: 1702128 | elapsed time per iteration (ms): 25674.0 | learning rate: 6.000E-05 | global batch size: 592 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 12340/ 159576 | consumed samples: 1708048 | elapsed time per iteration (ms): 26437.1 | learning rate: 6.000E-05 | global batch size: 592 | loss scale: 1.0 | grad norm: 0.000 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -Killing subprocess 76100 -Killing subprocess 76101 -Killing subprocess 76102 -Killing subprocess 76103 -Traceback (most recent call last): - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/runpy.py", line 194, in _run_module_as_main - return _run_code(code, main_globals, None, - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/runpy.py", line 87, in _run_code - exec(code, run_globals) - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch/distributed/launch.py", line 340, in - main() - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch/distributed/launch.py", line 326, in main - sigkill_handler(signal.SIGTERM, None) # not coming back - File "/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch/distributed/launch.py", line 301, in sigkill_handler - raise subprocess.CalledProcessError(returncode=last_return_code, cmd=cmd) -subprocess.CalledProcessError: Command '['/gpfswork/rech/six/commun/conda/tr1-13B/bin/python', '-u', '/gpfswork/rech/six/commun/code/tr8-104B/Megatron-DeepSpeed-tr8-104B/pretrain_gpt.py', '--local_rank=3', '--tensor-model-parallel-size', '4', '--pipeline-model-parallel-size', '8', '--num-layers', '32', '--hidden-size', '16384', '--ffn-hidden-size', '20480', '--num-attention-heads', '32', '--seq-length', '2048', '--max-position-embeddings', '2048', '--micro-batch-size', '1', '--rampup-batch-size', '16', '16', '6_000_000', '--global-batch-size', '2048', '--train-samples', '300_000_000', '--vocab-file', '/gpfswork/rech/six/commun/code/tr8-104B/Megatron-DeepSpeed-tr8-104B/data/gpt2-vocab.json', '--merge-file', '/gpfswork/rech/six/commun/code/tr8-104B/Megatron-DeepSpeed-tr8-104B/data/gpt2-merges.txt', '--loss-scale', '12', '--clip-grad', '1.0', '--fp16', '--checkpoint-activations', '--seed', '42', '--optimizer', 'adam', '--adam-beta1', '0.9', '--adam-beta2', '0.999', '--adam-eps', '1e-8', '--lr', '6e-5', '--min-lr', '6e-6', '--lr-decay-style', 'cosine', '--lr-decay-samples', '126_953_125', '--lr-warmup-samples', '216_320', '--clip-grad', '1.0', '--weight-decay', '1e-1', '--exit-duration-in-mins', '1190', '--log-interval', '10', '--save-interval', '1500', '--eval-interval', '1000', '--eval-iters', '5', '--codecarbon-dir', '/gpfsscratch/rech/six/commun/checkpoints/tr8-104B/tr8-104B-logs/codecarbon', '--tensorboard-dir', '/gpfsscratch/rech/six/commun/checkpoints/tr8-104B/tr8-104B-logs/tensorboard', '--tensorboard-queue-size', '5', '--log-timers-to-tensorboard', '--log-batch-size-to-tensorboard', '--log-validation-ppl-to-tensorboard', '--save', '/gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints', '--load', '/gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints', '--data-path', '/gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document', '--data-impl', 'mmap', '--split', '949,50,1', '--distributed-backend', 'nccl', 
-srun: error: r6i5n7: task 0: Exited with exit code 1
-srun: Terminating job step 1188168.0
-Killing subprocess 59848
-Killing subprocess 59849
-Killing subprocess 59850
-Killing subprocess 59851
-Main process received SIGTERM, exiting
-srun: error: r8i1n2: task 43: Exited with exit code 1
-srun: error: r9i5n7: task 109: Exited with exit code 1
-srun: error: r9i7n1: task 120: Exited with exit code 1
-*****************************************
-Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
-*****************************************
-***************************************** -***************************************** -Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. -***************************************** -***************************************** -Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. -***************************************** -***************************************** -Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. -***************************************** -***************************************** -Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. -***************************************** -***************************************** -Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. -***************************************** -***************************************** -Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. -***************************************** -Killing subprocess 32020 -Killing subprocess 32021 -Killing subprocess 32022 -Killing subprocess 32023 -Main process received SIGTERM, exiting -Killing subprocess 2391 -Killing subprocess 2392 -Killing subprocess 2393 -Killing subprocess 2395 -Main process received SIGTERM, exiting -Killing subprocess 8155 -Killing subprocess 8156 -Killing subprocess 8157 -Killing subprocess 8158 -Main process received SIGTERM, exiting -Killing subprocess 1105 -Killing subprocess 1106 -Killing subprocess 1107 -Killing subprocess 1108 -Main process received SIGTERM, exiting -Killing subprocess 61308 -Killing subprocess 70292 -Killing subprocess 42836 -Killing subprocess 70293 -Killing subprocess 70294 -Killing subprocess 30001 -Killing subprocess 61309 -Killing subprocess 61310 -Killing subprocess 61312 -Killing subprocess 42837 -Killing subprocess 57225 -Killing subprocess 70296 -Killing subprocess 30002 -Killing subprocess 30003 -Killing subprocess 13020 -Main process received SIGTERM, exiting -Killing subprocess 40485 -Killing subprocess 72254 -Killing subprocess 42838 -Killing subprocess 42840 -Main process received SIGTERM, exiting -Killing subprocess 57226 -Killing subprocess 57227 -Main process received SIGTERM, exiting -Killing subprocess 76054 -Killing subprocess 30004 -Main process received SIGTERM, exiting -Killing subprocess 13021 -Killing subprocess 40486 -Killing subprocess 14664 -Killing subprocess 72255 -Killing subprocess 57228 -Main process received SIGTERM, exiting -Killing subprocess 16769 -Killing subprocess 76055 -Killing subprocess 76056 -Killing subprocess 13022 -Killing subprocess 13023 -Main process received SIGTERM, exiting -Killing subprocess 40487 -Killing subprocess 40488 -Main process 
received SIGTERM, exiting -Killing subprocess 14665 -Killing subprocess 14666 -Killing subprocess 14668 -Killing subprocess 72256 -Killing subprocess 72258 -Main process received SIGTERM, exiting -Killing subprocess 16770 -Killing subprocess 16771 -Killing subprocess 76057 -Main process received SIGTERM, exiting -Killing subprocess 60803 -Main process received SIGTERM, exiting -Killing subprocess 66379 -Killing subprocess 16772 -Main process received SIGTERM, exiting -Killing subprocess 60804 -Killing subprocess 66380 -Killing subprocess 66381 -Killing subprocess 13204 -Killing subprocess 60805 -Killing subprocess 60806 -Main process received SIGTERM, exiting -Killing subprocess 66382 -Main process received SIGTERM, exiting -Killing subprocess 13205 -Killing subprocess 13206 -Killing subprocess 13207 -Killing subprocess 33516 -Killing subprocess 33006 -Main process received SIGTERM, exiting -Killing subprocess 33517 -Killing subprocess 33518 -Killing subprocess 33520 -Killing subprocess 33007 -Killing subprocess 33008 -Killing subprocess 33009 -Killing subprocess 72301 -Killing subprocess 16814 -Main process received SIGTERM, exiting -Killing subprocess 59087 -Killing subprocess 74735 -Killing subprocess 13261 -Main process received SIGTERM, exiting -Killing subprocess 55620 -Killing subprocess 72302 -Killing subprocess 16815 -Killing subprocess 59088 -Killing subprocess 74736 -Killing subprocess 74737 -Killing subprocess 74738 -Main process received SIGTERM, exiting -Killing subprocess 13262 -Killing subprocess 55621 -Killing subprocess 55622 -Killing subprocess 72303 -Killing subprocess 72304 -Main process received SIGTERM, exiting -slurmstepd: error: *** STEP 1271130.0 ON r7i6n1 CANCELLED AT 2021-09-27T17:43:09 *** -Killing subprocess 5069 -Killing subprocess 16816 -Killing subprocess 16817 -Main process received SIGTERM, exiting -Killing subprocess 59089 -Killing subprocess 59090 -Main process received SIGTERM, exiting -Killing subprocess 36826 -Killing subprocess 13263 -Killing subprocess 13264 -Main process received SIGTERM, exiting -Killing subprocess 55623 -Main process received SIGTERM, exiting -Killing subprocess 72745 -Killing subprocess 5070 -Killing subprocess 5071 -Killing subprocess 36827 -Killing subprocess 36828 -Killing subprocess 22929 -Killing subprocess 5072 -Main process received SIGTERM, exiting -Killing subprocess 23020 -Killing subprocess 39440 -Killing subprocess 36829 -Main process received SIGTERM, exiting -Killing subprocess 72746 -Killing subprocess 23021 -Killing subprocess 39441 -Killing subprocess 22930 -Killing subprocess 22931 -Killing subprocess 60544 -Killing subprocess 72747 -Killing subprocess 72748 -Main process received SIGTERM, exiting -Killing subprocess 23022 -Killing subprocess 23023 -Main process received SIGTERM, exiting -Killing subprocess 39442 -Killing subprocess 39443 -Main process received SIGTERM, exiting -Killing subprocess 4007 -Killing subprocess 22932 -Main process received SIGTERM, exiting -Killing subprocess 60545 -Killing subprocess 38454 -Killing subprocess 31565 -Killing subprocess 62249 -Killing subprocess 4008 -Killing subprocess 4009 -Killing subprocess 60546 -Killing subprocess 60547 -Main process received SIGTERM, exiting -Killing subprocess 38455 -Killing subprocess 38456 -Killing subprocess 65136 -Killing subprocess 31566 -Killing subprocess 31567 -Killing subprocess 31568 -Main process received SIGTERM, exiting -Killing subprocess 14739 -Killing subprocess 62250 -Killing subprocess 62251 -Killing subprocess 31604 
-Killing subprocess 4010 -Main process received SIGTERM, exiting -Killing subprocess 38457 -Main process received SIGTERM, exiting -Killing subprocess 65137 -Killing subprocess 14740 -Killing subprocess 14741 -Killing subprocess 62252 -Main process received SIGTERM, exiting -Killing subprocess 31605 -Killing subprocess 65138 -Killing subprocess 65139 -Main process received SIGTERM, exiting -Killing subprocess 14743 -Main process received SIGTERM, exiting -Killing subprocess 31606 -Killing subprocess 31607 -Main process received SIGTERM, exiting -Killing subprocess 3548 -Killing subprocess 54160 -Killing subprocess 3549 -Killing subprocess 3550 -Killing subprocess 54161 -Killing subprocess 54162 -Killing subprocess 54164 -Main process received SIGTERM, exiting -Killing subprocess 33462 -Killing subprocess 37254 -Killing subprocess 62641 -Killing subprocess 3552 -Main process received SIGTERM, exiting -Killing subprocess 33463 -Killing subprocess 33464 -Killing subprocess 78252 -Killing subprocess 37255 -Killing subprocess 37256 -Killing subprocess 62642 -Killing subprocess 62643 -Killing subprocess 62644 -Killing subprocess 33465 -Main process received SIGTERM, exiting -Killing subprocess 78253 -Killing subprocess 78254 -Killing subprocess 78255 -Killing subprocess 37257 -Main process received SIGTERM, exiting -Main process received SIGTERM, exiting -Killing subprocess 71588 -Killing subprocess 52835 -Killing subprocess 66284 -Main process received SIGTERM, exiting -Killing subprocess 71589 -Killing subprocess 52836 -Killing subprocess 66285 -Killing subprocess 71590 -Killing subprocess 71591 -Main process received SIGTERM, exiting -Killing subprocess 73370 -Killing subprocess 52837 -Killing subprocess 52838 -Main process received SIGTERM, exiting -Killing subprocess 66286 -Killing subprocess 66287 -Main process received SIGTERM, exiting -Killing subprocess 70128 -Killing subprocess 73371 -Killing subprocess 73372 -Killing subprocess 76744 -Killing subprocess 73373 -Main process received SIGTERM, exiting -Killing subprocess 70129 -Killing subprocess 70130 -Killing subprocess 70132 -Main process received SIGTERM, exiting -Killing subprocess 76745 -Killing subprocess 76746 -Killing subprocess 76748 -Main process received SIGTERM, exiting -Killing subprocess 42114 -Killing subprocess 42115 -Killing subprocess 42116 -Killing subprocess 42117 -Main process received SIGTERM, exiting -Killing subprocess 22439 -Killing subprocess 22440 -Killing subprocess 22441 -Killing subprocess 22442 -Killing subprocess 6741 -Main process received SIGTERM, exiting -Killing subprocess 6742 -Killing subprocess 6743 -Killing subprocess 6744 -Killing subprocess 27342 -Main process received SIGTERM, exiting -Killing subprocess 4903 -Killing subprocess 27343 -Killing subprocess 7749 -Killing subprocess 4904 -Killing subprocess 4905 -Killing subprocess 4906 -Main process received SIGTERM, exiting -Killing subprocess 27344 -Killing subprocess 27345 -Main process received SIGTERM, exiting -Killing subprocess 7750 -Killing subprocess 7751 -Killing subprocess 7753 -Main process received SIGTERM, exiting -Killing subprocess 78894 -Killing subprocess 78895 -Killing subprocess 78896 -Killing subprocess 78897 -Main process received SIGTERM, exiting -Killing subprocess 24072 -Killing subprocess 7177 -Killing subprocess 24073 -Killing subprocess 7178 -Killing subprocess 24074 -Killing subprocess 24075 -Main process received SIGTERM, exiting -Killing subprocess 7179 -Killing subprocess 7180 -Main process received SIGTERM, exiting 
-Killing subprocess 78710
-Killing subprocess 78711
-Killing subprocess 78712
-Killing subprocess 78713
-Main process received SIGTERM, exiting
-Killing subprocess 66743
-Killing subprocess 66744
-Killing subprocess 66745
-Killing subprocess 66751
-Main process received SIGTERM, exiting
-Killing subprocess 65099
-Killing subprocess 65100
-Killing subprocess 65101
-Killing subprocess 65103
-Main process received SIGTERM, exiting
-srun: Job step aborted: Waiting up to 62 seconds for job step to finish.
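The kill/SIGTERM lines above are each node's launcher reaping its four worker ranks after slurm aborts the job step. A minimal Python sketch of that behaviour, assuming one parent process per node and placeholder `sleep` workers (not the actual launcher code):

    import os
    import signal
    import subprocess

    # Placeholder workers standing in for the four training ranks per node.
    workers = [subprocess.Popen(["sleep", "300"]) for _ in range(4)]

    def _on_sigterm(signum, frame):
        for proc in workers:
            print(f"Killing subprocess {proc.pid}")
            proc.kill()
        print("Main process received SIGTERM, exiting")
        os._exit(1)

    signal.signal(signal.SIGTERM, _on_sigterm)
    signal.pause()  # block until the scheduler delivers SIGTERM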
-*****************************************
-Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
-*****************************************
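This banner is printed once per spawned process, so it repeats for every rank in the job; a single copy is kept above. The launcher only sets a conservative default of 1; a rough tuning sketch, assuming four ranks per node as in this job (the divisor is illustrative, not taken from the log):

    import os

    # Give each of the 4 ranks on a node an equal share of the cores,
    # set before torch is first imported so intra-op threading sees it.
    os.environ.setdefault("OMP_NUM_THREADS", str(max(1, (os.cpu_count() or 1) // 4)))

    import torch

    print("torch intra-op threads:", torch.get_num_threads())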
---------------------------------------------------
-DeepSpeed C++/CUDA extension op report
---------------------------------------------------
-NOTE: Ops not installed will be just-in-time (JIT) compiled at
- runtime if needed. Op compatibility means that your system
- meet the required dependencies to JIT install the op.
---------------------------------------------------
-JIT compiled ops requires ninja
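Every rank emits this report at import time, and the concurrent writes interleave in the log; one clean copy is kept above. Assuming a DeepSpeed of this vintage, the same report can be regenerated on demand via the bundled reporting module (wrapped by the `ds_report` console script):

    import subprocess

    # Re-runs DeepSpeed's environment/op-compatibility report outside the job.
    subprocess.run(["python", "-m", "deepspeed.env_report"], check=True)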
-ninja .................. [OKAY]
---------------------------------------------------
-op name ................ installed .. compatible
---------------------------------------------------
-cpu_adam ............... [YES] ...... [OKAY]
-fused_adam ............. [NO] ....... [OKAY]
-fused_lamb ............. [NO] ....... [OKAY]
-sparse_attn ............ [NO] ....... [OKAY]
-transformer ............ [NO] ....... [OKAY]
-stochastic_transformer . [NO] ....... [OKAY]
---------------------------------------------------
- [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`.
-async_io ............... [NO] ....... [NO]
-transformer_inference .. [NO] ....... [OKAY]
-utils .................. [YES] ...... [OKAY]
-quantizer .............. [NO] ....... [OKAY]
---------------------------------------------------
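The [NO]/[OKAY] verdicts come from each op builder's compatibility check. A sketch of probing the `async_io` op that the warning above flags, assuming the `deepspeed.ops.op_builder` layout of this 0.4.x install:

    from deepspeed.ops.op_builder import AsyncIOBuilder

    # is_compatible() is the check behind the [NO]/[OKAY] columns; async_io
    # keeps reporting [NO] until the system package libaio-dev is present.
    print("async_io compatible:", AsyncIOBuilder().is_compatible())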
-DeepSpeed general environment info:
-torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']
-torch version .................... 1.8.1
-torch cuda version ............... 11.1
-nvcc version ..................... 11.2
-deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']
-deepspeed info ................... 0.4.2+bc17042, bc17042, big-science
-deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1
-/bin/sh: line 0: type: git: not found
-**** Git info for Megatron: git_hash=unknown git_branch=unknown ****
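A short sketch that reprints the same version fields from a live interpreter (the names used are standard `torch`/`deepspeed` attributes):

    import torch
    import deepspeed

    print("torch version   :", torch.__version__)
    print("torch cuda      :", torch.version.cuda)
    print("deepspeed info  :", deepspeed.__version__)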
--------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja ----------------------------------------------------------------------------------------------------- - ---------------------------------------------------DeepSpeed C++/CUDA extension op report--------------------------------------------------DeepSpeed C++/CUDA extension op report - - - ---------------------------------------------------DeepSpeed C++/CUDA extension op report-------------------------------------------------- -DeepSpeed C++/CUDA extension op report - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. ---------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- - - ---------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.-------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - -JIT compiled ops requires ninja --------------------------------------------------- -JIT compiled ops requires ninja-------------------------------------------------- - - -JIT compiled ops requires ninja -JIT compiled ops requires ninja -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -ninjaninja .................................... [OKAY][OKAY] - ----------------------------------------------------------------------------------------------------- - -op nameop name ................................ installedinstalled .... compatiblecompatible - ----------------------------------------------------------------------------------------------------- - -cpu_adam cpu_adam............... ...............[YES] [YES]...... ......[OKAY] -[OKAY] -fused_adam .............fused_adam [NO]............. .......[NO] [OKAY]....... - [OKAY] -fused_lamb fused_lamb............. .............[NO] [NO]....... .......[OKAY] -[OKAY] -sparse_attnsparse_attn ........................ [NO][NO] .............. [OKAY][OKAY] - -transformer transformer............ ............[NO] [NO]....... .......[OKAY] -[OKAY] -stochastic_transformerstochastic_transformer .. [NO][NO] .............. [OKAY][OKAY] - -ninjaninjaninjaninja ........................................................................ 
[OKAY] [OKAY] -[OKAY][OKAY] - - --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- - - - -op nameop nameop nameop name ................................................................ installedinstalledinstalled installed .. .... .. compatible compatiblecompatible - -compatible --------------------------------------------------- --------------------------------------------------- --------------------------------------------------- --------------------------------------------------- - -cpu_adamcpu_adam cpu_adamcpu_adam.............................. ..............................[YES][YES] [YES][YES]............ [OKAY]............[OKAY] - -[OKAY][OKAY] - -fused_adam fused_adam.............fused_adamfused_adam .............[NO].......................... [NO].......[NO][NO] .......[OKAY].............. - [OKAY][OKAY][OKAY] - - -fused_lamb ............. fused_lambfused_lamb[NO] fused_lamb ............. ................................. [NO][OKAY][NO][NO] - ..................... [OKAY][OKAY][OKAY] - - -sparse_attn ............ [NO] .......sparse_attnsparse_attn [OKAY]sparse_attn........................ - [NO][NO]............ transformer .............. [NO] ............ [OKAY] [OKAY]....... -[NO] - [OKAY].......transformer - transformer [OKAY] ............ -............ [NO]transformer[NO] .......stochastic_transformer................... [OKAY][NO][OKAY] -. - .......[NO] [OKAY]stochastic_transformer -.......stochastic_transformer [OKAY]. -. stochastic_transformer [NO] [NO] ............... [OKAY][NO][OKAY] - -....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ sparse_attninstalled .............. [NO]compatible -.......-------------------------------------------------- -[OKAY] -transformer ............ [NO] cpu_adam....... ...............[OKAY] -[YES] ...... stochastic_transformer[OKAY] -. [NO] ....... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] ------------------------------------------------------------------------------------------------------------------------------------------------------- - - -DeepSpeed C++/CUDA extension op reportDeepSpeed C++/CUDA extension op reportDeepSpeed C++/CUDA extension op report - - --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- - - - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. 
Op compatibility means that your system - meet the required dependencies to JIT install the op. -DeepSpeed C++/CUDA extension op report - --------------------------------------------------- ----------------------------------------------------------------------------------------------------- - --------------------------------------------------- -JIT compiled ops requires ninja -JIT compiled ops requires ninjaJIT compiled ops requires ninja -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - - --------------------------------------------------- -JIT compiled ops requires ninja -ninjaninjaninjaninja ........................................................................ [OKAY] [OKAY][OKAY] -[OKAY] - --------------------------------------------------- - ------------------------------------------------------------------------------------------------------------------------------------------------------- - -op name - op nameop nameop name................ ................ ................ ................installedinstalled installedinstalled.... .. ..compatiblecompatible - -compatiblecompatible---------------------------------------------------------------------------------------------------- - - - ----------------------------------------------------------------------------------------------------- - -cpu_adam ...............cpu_adam [YES]...............cpu_adamcpu_adam ....................................[YES] [YES] [OKAY] - ...... ...... [YES] [OKAY] fused_adam[OKAY] - -................... [OKAY][NO] - ....... [OKAY] -fused_lamb fused_adam............. fused_adam .............fused_adam [NO] ............. [NO] .................... [NO] [OKAY]....... - [NO].......[OKAY] [OKAY] -....... - [OKAY]fused_lamb -fused_lamb ............. .............fused_lambsparse_attn[NO] [NO]................................ [NO][NO].......[OKAY] -[OKAY].............. - [OKAY][OKAY] - -transformer ............ [NO] ....... [OKAY] -sparse_attn sparse_attn............stochastic_transformer sparse_attn [NO] ......................... .......[NO] [NO] [OKAY][NO] ....... - ....... ....... [OKAY]transformer[OKAY] - -[OKAY]............ - transformertransformer[NO] ............................... [NO][NO][OKAY] - .............. [OKAY][OKAY]stochastic_transformer - - . stochastic_transformer[NO]stochastic_transformer ....... .[OKAY] . -[NO] [NO]....... .......[OKAY] -[OKAY] ------------------------------------------------------------------------------------------------------------------------------------------------------- - - -DeepSpeed C++/CUDA extension op reportDeepSpeed C++/CUDA extension op report - -DeepSpeed C++/CUDA extension op report-------------------------------------------------- ----------------------------------------------------------------------------------------------------- --------------------------------------------------- - - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.DeepSpeed C++/CUDA extension op reportNOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. 
Op compatibility means that your system - meet the required dependencies to JIT install the op. - - - --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- - - - -JIT compiled ops requires ninjaNOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.JIT compiled ops requires ninjaJIT compiled ops requires ninja - - - --------------------------------------------------- -JIT compiled ops requires ninja -ninjaninjaninjaninja .................................... .................. ..................[OKAY][OKAY][OKAY] - - -[OKAY]------------------------------------------------------------------------------------------------------------------------------------------------------ - - - -op name-------------------------------------------------- op name................op name - ................op name installed................ installed .................. installed compatibleinstalled -.. --------------------------------------------------....compatible - - compatible--------------------------------------------------compatible - - ----------------------------------------------------------------------------------------------------- - -cpu_adam ............... [YES] ......cpu_adam cpu_adam[OKAY]...............cpu_adam - ...............[YES]............... [YES] ...... ...... [YES] [OKAY]fused_adam - [OKAY] ...... -............. [OKAY][NO] -.......fused_adam [OKAY]............. - [NO]fused_adam fused_lamb....... fused_adam............. .............[OKAY]............. [NO] - [NO] [NO]..............fused_lamb .......[OKAY][OKAY] ............. - - [OKAY][NO] - fused_lamb....... .............fused_lamb[OKAY] ............. -[NO] sparse_attn [NO] ....... ............ .......[OKAY] [NO] -[OKAY] -.......sparse_attn [OKAY]............ - [NO] .......transformer [OKAY]............ - [NO]sparse_attnsparse_attn transformer....... ............ ........................ [OKAY] [NO] -[NO][NO] .......stochastic_transformer.............. [OKAY] [OKAY] -[OKAY] -. - [NO]transformerstochastic_transformer transformer....... ............[OKAY] .............[NO] - [NO][NO]....... ..............[OKAY] -[OKAY][OKAY] - -stochastic_transformer stochastic_transformer. [NO] ........ [NO][OKAY] -....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. .. [NO] - ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizerasync_io ............................. [NO][NO] .............. [OKAY][NO] - --------------------------------------------------- -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... 
[OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- --------------------------------------------------- -----------------------------------------------------------------------------------------------------DeepSpeed C++/CUDA extension op report - --------------------------------------------------- -DeepSpeed C++/CUDA extension op report-------------------------------------------------- -DeepSpeed C++/CUDA extension op report - - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.--------------------------------------------------DeepSpeed C++/CUDA extension op report-------------------------------------------------- - - - ---------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. ---------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. -JIT compiled ops requires ninja - --------------------------------------------------- --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. 
-JIT compiled ops requires ninja -JIT compiled ops requires ninja --------------------------------------------------- - -JIT compiled ops requires ninja ------------------------------------------------------------------------------------------------------------------------------------------------------- - --------------------------------------------------- -DeepSpeed C++/CUDA extension op reportDeepSpeed C++/CUDA extension op report -DeepSpeed C++/CUDA extension op report - --------------------------------------------------- -DeepSpeed C++/CUDA extension op report - -----------------------------------------------------------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.-------------------------------------------------- - - - ---------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.JIT compiled ops requires ninja---------------------------------------------------------------------------------------------------- - - -JIT compiled ops requires ninja --------------------------------------------------- -JIT compiled ops requires ninja - -JIT compiled ops requires ninja -ninjaninjaninjaninja .................. .................. .................. ..................[OKAY][OKAY] - - [OKAY][OKAY]-------------------------------------------------- - --------------------------------------------------- ----------------------------------------------------------------------------------------------------- -op name - - op name................op nameop name ................................installed................ installed..installedinstalled compatible .... -.. --------------------------------------------------compatible -compatible -compatible - ----------------------------------------------------------------------------------------------------- --------------------------------------------------- - -cpu_adam cpu_adam...............cpu_adam [YES]cpu_adam.............................. [YES]......[YES] ............... ............[YES] [OKAY] -[OKAY][OKAY]...... - - [OKAY] -fused_adamfused_adam fused_adam .............fused_adam ............. ............. [NO]............. [NO][NO] ....... [NO] .............. [OKAY]....... [OKAY] -[OKAY] - -[OKAY]fused_lamb -fused_lambfused_lamb ............. fused_lamb ..........................[NO] ............. [NO] [NO]....... [NO] ....... ....... [OKAY].......[OKAY] - [OKAY] -[OKAY] - -sparse_attnsparse_attn sparse_attn ............sparse_attn ............[NO] ............ ................... [NO] [OKAY][NO] [NO] - ....... ....... ....... [OKAY]transformer[OKAY] - -[OKAY]............ -transformer[NO]transformer ...............................transformer [OKAY][NO]............ - [NO] .......[NO] .......stochastic_transformer....... [OKAY] [OKAY] - -[OKAY]. - [NO]stochastic_transformerstochastic_transformer stochastic_transformer....... . . [OKAY] [NO]. - [NO] ....... 
[NO].......[OKAY] -.......[OKAY] -[OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -ninjaninjaninjaninja ........................................................................ [OKAY][OKAY][OKAY][OKAY] - - - ------------------------------------------------------------------------------------------------------------------------------------------------------- --------------------------------------------------- - - -op nameop nameop nameop name ................ ................................ ................installedinstalledinstalled ..installed.. .. ..compatiblecompatible - -compatiblecompatible-------------------------------------------------- --------------------------------------------------- - - ----------------------------------------------------------------------------------------------------- - -cpu_adam ...............cpu_adamcpu_adam cpu_adam [YES].............................. ......[YES]...............[YES] ............[YES] [OKAY]......[OKAY] - -[OKAY] -[OKAY] -fused_adam fused_adam............. fused_adam ............. fused_adam .............[NO][NO] .......[NO] ....... [OKAY] - ....................[OKAY]fused_lamb - [OKAY] ............. -[NO]fused_lamb fused_lamb....................[NO] [NO] .................... .......[OKAY] [NO][OKAY] -[OKAY] - -....... [OKAY] -fused_lamb ............. [NO]sparse_attn .......sparse_attnsparse_attn............ ............ ............ [NO] [NO] [NO]....... [OKAY] ....... -[OKAY]....... - [OKAY][OKAY]transformer - - ............ transformertransformer[NO] ............ ................... [NO][NO][OKAY] - .......sparse_attn....... stochastic_transformer[OKAY] -............[OKAY]. - [NO]stochastic_transformer .......stochastic_transformer[NO]. [OKAY] [NO] -. ..............[NO] [OKAY][OKAY]....... - [OKAY] - -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja ------------------------------------------------------------------------------------------------------------------------------------------------------- - -DeepSpeed C++/CUDA extension op report -DeepSpeed C++/CUDA extension op report - -DeepSpeed C++/CUDA extension op report---------------------------------------------------------------------------------------------------- - - ---------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. 
---------------------------------------------------
-DeepSpeed C++/CUDA extension op report
---------------------------------------------------
-NOTE: Ops not installed will be just-in-time (JIT) compiled at
- runtime if needed. Op compatibility means that your system
- meets the required dependencies to JIT install the op.
---------------------------------------------------
-JIT compiled ops requires ninja
-ninja .................. [OKAY]
---------------------------------------------------
-op name ................ installed .. compatible
---------------------------------------------------
-cpu_adam ............... [YES] ...... [OKAY]
-fused_adam ............. [NO] ....... [OKAY]
-fused_lamb ............. [NO] ....... [OKAY]
-sparse_attn ............ [NO] ....... [OKAY]
-transformer ............ [NO] ....... [OKAY]
-stochastic_transformer . [NO] ....... [OKAY]
---------------------------------------------------
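-This per-rank op report is DeepSpeed's standard environment reporter; what
-follows is a minimal sketch of regenerating it offline in the same conda env
-(the `deepspeed.env_report` module path is an assumption about the 0.4.x
-layout; the installed `ds_report` console script is the usual entry point):
-
-    # Sketch: reprint the op table and environment info without training.
-    import deepspeed.env_report as env_report  # assumed module path
-
-    env_report.main()  # same output as running `ds_report`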
- [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`.
-async_io ............... [NO] ....... [NO]
-transformer_inference .. [NO] ....... [OKAY]
-utils .................. [YES] ...... [OKAY]
-quantizer .............. [NO] ....... [OKAY]
---------------------------------------------------
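-async_io is the one op reported incompatible ([NO] in both columns) because
-libaio is missing on the node; a minimal probe for that prerequisite,
-assuming the Debian layout implied by the `apt install libaio-dev` hint:
-
-    # Sketch: check for the libaio library/headers the async_io op needs.
-    import ctypes.util
-    import os.path
-
-    libaio_ok = ctypes.util.find_library("aio") is not None
-    headers_ok = os.path.exists("/usr/include/libaio.h")  # assumed location
-    if not (libaio_ok and headers_ok):
-        print("async_io will stay [NO]: install libaio-dev first")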
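-Every op except cpu_adam shows installed = [NO], meaning ninja JIT-builds it
-on first use; a sketch of exercising one row of the table above by hand
-(builder class names assume the `deepspeed.ops.op_builder` layout of this
-0.4.x tree):
-
-    # Sketch: mirror one op-table row and force its JIT build.
-    from deepspeed.ops.op_builder import FusedAdamBuilder  # assumed export
-
-    builder = FusedAdamBuilder()
-    print(builder.name, "compatible:", builder.is_compatible())  # [OKAY] column
-    fused_adam_ext = builder.load()  # installed [NO] -> ninja build runs here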
-DeepSpeed general environment info:
-torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']
-torch version .................... 1.8.1
-torch cuda version ............... 11.1
-nvcc version ..................... 11.2
-deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']
-deepspeed info ................... 0.4.2+bc17042, bc17042, big-science
-deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1
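-The environment block is identical on every rank; the same facts can be read
-in-process with standard torch/deepspeed attributes (nvcc version excepted,
-which comes from running `nvcc --version` on PATH):
-
-    # Sketch: collect the "general environment info" fields directly.
-    import torch
-    import deepspeed
-
-    print("torch install path ...", list(torch.__path__))  # a list, as logged
-    print("torch version ........", torch.__version__)
-    print("torch cuda version ...", torch.version.cuda)
-    print("deepspeed info .......", deepspeed.__version__)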
....................[NO] [NO] [OKAY] .......[NO] -....... [OKAY].......[OKAY] -stochastic_transformer - [OKAY] -. [NO] stochastic_transformer....... [OKAY] -. [NO] ....... [OKAY] -ninjaninjaninjaninja ........................................................................ [OKAY][OKAY][OKAY][OKAY] - - - ----------------------------------------------------------------------------------------------------- --------------------------------------------------- --------------------------------------------------- -op name -op name op name op name................ ................ ................installed installed................ installed .. installed.. ..compatiblecompatible.. - - --------------------------------------------------compatible ---------------------------------------------------compatible - - ----------------------------------------------------------------------------------------------------- - -cpu_adam cpu_adam............... ...............[YES] [YES]cpu_adam......cpu_adam .....................[OKAY]............... [OKAY] -[YES] - [YES]...... ......[OKAY] -[OKAY] -fused_adamfused_adam .......................... [NO][NO] fused_adam.............. [OKAY][OKAY]fused_adam............. - - fused_lamb.............[NO] fused_lamb[NO]............. ....................[NO]....... [NO] [OKAY].......[OKAY] -.......[OKAY] - -fused_lamb[OKAY] - fused_lamb............. .............[NO] [NO]....... .......[OKAY] -[OKAY]sparse_attn - sparse_attn............ ............[NO] [NO]....... .......[OKAY] -[OKAY] -sparse_attn transformersparse_attntransformer ............ ........................ ............[NO] [NO] [NO] [NO] ............................ [OKAY] [OKAY][OKAY] - - -[OKAY]stochastic_transformertransformer - stochastic_transformer ............. transformer. [NO] [NO] ............[NO] ....... .............. [NO][OKAY] - [OKAY] [OKAY] -....... - [OKAY] -stochastic_transformerstochastic_transformer .. [NO][NO] .............. [OKAY][OKAY] - -ninjaninjaninjaninja ...................................................... .................. [OKAY][OKAY] [OKAY] - -[OKAY] --------------------------------------------------- ----------------------------------------------------------------------------------------------------- - - ---------------------------------------------------op nameop name - op name ................ ................op name ................ installedinstalledinstalled................ .. .. installedcompatiblecompatible.. -.. - compatible----------------------------------------------------------------------------------------------------compatible - - - ----------------------------------------------------------------------------------------------------- - -cpu_adam cpu_adam............... ...............[YES]cpu_adam [YES].....................cpu_adam ............... [OKAY]...... [YES] -[YES] [OKAY] ...... -...... [OKAY] -[OKAY] -fused_adam ............. fused_adam[NO] .......fused_adam ............. [OKAY] .............fused_adam -[NO] [NO].............fused_lamb....... .......[NO] ............. [OKAY][OKAY] ....... - -[NO] [OKAY].......fused_lambfused_lamb - .............[OKAY]fused_lamb............. - [NO] .............[NO] ..............[NO] [OKAY][OKAY]....... -sparse_attn - ............ [NO][OKAY] -....... [OKAY] -transformer sparse_attn............sparse_attn [NO]........................ .......[NO][NO] sparse_attn [OKAY].............. - ............[OKAY][OKAY] -[NO] -stochastic_transformertransformer .......transformer............. 
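The table above is the per-op summary that DeepSpeed's `ds_report` utility prints: "installed" says whether the op was pre-built at install time, "compatible" whether it can be JIT-compiled with ninja on this system. As a minimal sketch, assuming the deepspeed.ops.op_builder interface of the 0.4.x line (builder names and method signatures may differ in other releases), the same check can be run programmatically:

    # Sketch: reproduce the installed/compatible check for a few ops.
    # Assumes DeepSpeed 0.4.x-style op builders; not the exact ds_report code.
    from deepspeed.ops.op_builder import (
        CPUAdamBuilder,
        FusedAdamBuilder,
        FusedLambBuilder,
    )

    for builder in (CPUAdamBuilder(), FusedAdamBuilder(), FusedLambBuilder()):
        # is_compatible() corresponds to the [OKAY] "compatible" column: the op
        # can be JIT-built on this system even if it was not pre-installed.
        print(f"{builder.name:<22} compatible: {builder.is_compatible()}")

    # An op reported as [NO] installed is built on first use, e.g.:
    #   fused_adam_module = FusedAdamBuilder().load()  # triggers the ninja JIT build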
-DeepSpeed general environment info:
-torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']
-torch version .................... 1.8.1
-torch cuda version ............... 11.1
-nvcc version ..................... 11.2
-deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']
-deepspeed info ................... 0.4.2+bc17042, bc17042, big-science
-deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1
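The same environment fields can be recovered from a Python shell. A minimal sketch using standard torch/deepspeed metadata (note that nvcc reports the system CUDA toolkit, 11.2 here, which need not match the 11.1 CUDA build of torch):

    # Sketch: print the fields shown under "DeepSpeed general environment info:".
    import subprocess

    import deepspeed
    import torch

    print("torch install path :", list(torch.__path__))
    print("torch version      :", torch.__version__)
    print("torch cuda version :", torch.version.cuda)
    print("deepspeed path     :", list(deepspeed.__path__))
    print("deepspeed info     :", deepspeed.__version__)
    # nvcc version comes from whichever CUDA toolkit is first on PATH:
    nvcc = subprocess.run(["nvcc", "--version"], capture_output=True, text=True)
    print(nvcc.stdout.strip().splitlines()[-1])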
-[WARNING] async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`.
-async_io ............... [NO] ....... [NO]
-transformer_inference .. [NO] ....... [OKAY]
-utils .................. [YES] ...... [OKAY]
-quantizer .............. [NO] ....... [OKAY]
---------------------------------------------------
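async_io is the one op above that stays [NO] under "compatible": it needs the libaio system library, exactly as the warning says. As an illustration only (this ctypes probe is not DeepSpeed's own test), the underlying shared library can be checked like this:

    # Sketch: check whether the libaio runtime needed by async_io is present.
    import ctypes

    def has_libaio() -> bool:
        try:
            ctypes.CDLL("libaio.so.1")  # runtime library from the libaio package
            return True
        except OSError:
            return False

    print("libaio available:", has_libaio())
    # If False, the fix from the warning applies: `apt install libaio-dev`
    # (the -dev package also provides the headers needed to JIT-build the op).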
[OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** ------------------------------------------------------------------------------------------------------------------------------------------------------- - - -DeepSpeed C++/CUDA extension op reportDeepSpeed C++/CUDA extension op report -DeepSpeed C++/CUDA extension op report ----------------------------------------------------------------------------------------------------- - - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. -/bin/sh: line 0: type: git: not found --------------------------------------------------- ----------------------------------------------------------------------------------------------------- --------------------------------------------------- - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. -JIT compiled ops requires ninjaDeepSpeed C++/CUDA extension op report -JIT compiled ops requires ninja - --------------------------------------------------- --------------------------------------------------- - -JIT compiled ops requires ninjaNOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - --------------------------------------------------- -JIT compiled ops requires ninja -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** --------------------------------------------------- -DeepSpeed C++/CUDA extension op report-------------------------------------------------- --------------------------------------------------- - -DeepSpeed C++/CUDA extension op reportNOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - ----------------------------------------------------------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - ---------------------------------------------------JIT compiled ops requires ninja---------------------------------------------------------------------------------------------------- - - - -JIT compiled ops requires ninjaDeepSpeed C++/CUDA extension op report -DeepSpeed C++/CUDA extension op report - ----------------------------------------------------------------------------------------------------- - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. 
Op compatibility means that your system - meet the required dependencies to JIT install the op. - ----------------------------------------------------------------------------------------------------- - -JIT compiled ops requires ninjaJIT compiled ops requires ninja - - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_io ...............async_io [NO] ...................... [NO][NO] -....... [NO] -transformer_inference transformer_inference.. ..[NO] [NO]....... .......[OKAY] -[OKAY] -utils ..................utils [YES].................. ......[YES] [OKAY]...... - [OKAY] -quantizer .............. quantizer[NO] ..................... [NO][OKAY] -....... [OKAY] --------------------------------------------------- --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] .......  [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`.[NO] - - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -transformer_inference .. [NO] ....... [OKAY] -utils ..................async_io async_io [YES] .............................. ......[NO][NO] [OKAY].............. - [NO][NO] - -quantizer .............. [NO] ....... [OKAY] ---------------------------------------------------transformer_inference - transformer_inference.. ..[NO] [NO] .............. [OKAY][OKAY] - -utilsutils .................................... [YES][YES] ............ [OKAY][OKAY] - -quantizer .............. quantizer[NO] ..................... [NO][OKAY] -....... [OKAY] --------------------------------------------------- --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -/bin/sh: line 0: type: git: not found -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... 
[OKAY] -/bin/sh: line 0: type: git: not found --------------------------------------------------- -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -utils .................. [YES] ...... [OKAY] -async_io ............... [NO] ....... [NO] -quantizer .............. [NO] ....... [OKAY] -transformer_inference .. [NO] ....... [OKAY] --------------------------------------------------- -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -transformer_inference .. [NO] ....... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -utils .................. [YES] ...... [OKAY] ----------------------------------------------------------------------------------------------------- ---------------------------------------------------DeepSpeed C++/CUDA extension op report - - -DeepSpeed C++/CUDA extension op report---------------------------------------------------------------------------------------------------- -DeepSpeed C++/CUDA extension op report - -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.DeepSpeed C++/CUDA extension op report --------------------------------------------------- - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - ----------------------------------------------------------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.-------------------------------------------------- -JIT compiled ops requires ninja - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system --------------------------------------------------- - meet the required dependencies to JIT install the op. ---------------------------------------------------JIT compiled ops requires ninja - - ---------------------------------------------------JIT compiled ops requires ninja - -JIT compiled ops requires ninja -ninjaninjaninjaninja ........................................................................ 
-ninja .................. [OKAY]
---------------------------------------------------
-op name ................ installed .. compatible
---------------------------------------------------
-cpu_adam ............... [YES] ...... [OKAY]
-fused_adam ............. [NO] ....... [OKAY]
-fused_lamb ............. [NO] ....... [OKAY]
-sparse_attn ............ [NO] ....... [OKAY]
-transformer ............ [NO] ....... [OKAY]
-stochastic_transformer . [NO] ....... [OKAY]
---------------------------------------------------
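-The installed/compatible table above is what DeepSpeed's `ds_report` command prints, once per launched rank. A minimal sketch that reproduces individual rows, assuming the deepspeed 0.4.x op_builder API in which each builder exposes NAME and is_compatible():
-
-    # Hypothetical reproduction of a few table rows; the builder classes and
-    # methods are assumed from the deepspeed 0.4.x op_builder package.
-    from deepspeed.ops.op_builder import CPUAdamBuilder, FusedAdamBuilder, FusedLambBuilder
-
-    for builder in (CPUAdamBuilder(), FusedAdamBuilder(), FusedLambBuilder()):
-        status = "[OKAY]" if builder.is_compatible() else "[NO]"
-        print(builder.NAME.ljust(22, "."), status)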
-DeepSpeed general environment info:
-torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']
-torch version .................... 1.8.1
-torch cuda version ............... 11.1
-nvcc version ..................... 11.2
-deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']
-deepspeed info ................... 0.4.2+bc17042, bc17042, big-science
-deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1
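-This block pairs the runtime stack against what the wheel was built with: torch 1.8.1 with CUDA 11.1 at runtime, nvcc 11.2 on the node, and a wheel compiled for torch 1.8 / CUDA 11.1; the minor 11.1/11.2 gap is generally tolerated within the CUDA 11.x series. The same fields can be recovered from public attributes, as in this minimal sketch:
-
-    import os
-    import torch
-    import deepspeed
-
-    # Recover the fields the report prints, via public attributes.
-    print("torch install path ...", os.path.dirname(torch.__file__))
-    print("torch version ........", torch.__version__)       # 1.8.1 here
-    print("torch cuda version ...", torch.version.cuda)      # 11.1 here
-    print("deepspeed info .......", deepspeed.__version__)   # 0.4.2+bc17042 here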
[OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- --------------------------------------------------- -------------------------------------------------------------------------------------------------------------------------------------------------------DeepSpeed C++/CUDA extension op report - - - -DeepSpeed C++/CUDA extension op reportDeepSpeed C++/CUDA extension op report--------------------------------------------------DeepSpeed C++/CUDA extension op report - - - ---------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.---------------------------------------------------------------------------------------------------- - - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. ---------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- - --------------------------------------------------- -JIT compiled ops requires ninja-------------------------------------------------- - -JIT compiled ops requires ninja - -JIT compiled ops requires ninjaJIT compiled ops requires ninja - -ninjaninjaninjaninja .................. ....................................[OKAY].................. -[OKAY]-------------------------------------------------- [OKAY] - -[OKAY] ---------------------------------------------------op name --------------------------------------------------- -................ ---------------------------------------------------op nameop name - installed ................ op name................ .. installed ................installed compatible .. -installed..-------------------------------------------------- -compatible..compatible - ---------------------------------------------------compatible-------------------------------------------------- - - -cpu_adam-------------------------------------------------- -............... [YES]cpu_adam cpu_adam ...... cpu_adam .............................. [OKAY] ............... -[YES][YES] [YES] ............ [OKAY]......[OKAY]fused_adam - -[OKAY] ............. - [NO] ....... [OKAY]fused_adamfused_adam - fused_adam..........................fused_lamb .............[NO][NO]............. .......[NO]....... [NO][OKAY] -.......[OKAY]....... fused_lamb -[OKAY][OKAY] - -.............fused_lamb .............[NO]fused_lamb [NO]....... .............sparse_attn....... [OKAY][NO]............[OKAY] - - [NO]....... ....... [OKAY][OKAY] - -sparse_attn ............transformer sparse_attn[NO]............ ...................[NO]sparse_attn [OKAY].......[NO]............ - [NO][OKAY]....... - transformer ....... 
[OKAY] ............stochastic_transformer -[OKAY] -[NO]transformer. transformer ................... [OKAY][NO][NO] -............ ..............[NO]stochastic_transformer .......[OKAY][OKAY] -. -[OKAY] -[NO] stochastic_transformer....... stochastic_transformer[OKAY]. - [NO]. .......[NO] [OKAY]....... - [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... async_io[NO] ...................... [NO][NO] - ....... [NO] -transformer_inference transformer_inference.. ..[NO] [NO]....... .......[OKAY] -[OKAY] -utilsutils .................................... [YES][YES] ............ [OKAY][OKAY] - -quantizer ..............quantizer [NO].............. .......[NO] [OKAY]....... - [OKAY] --------------------------------------------------- --------------------------------------------------- -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninjaninjaninjaninja .................. .................. .................. ..................[OKAY][OKAY] - [OKAY] -[OKAY] ------------------------------------------------------------------------------------------------------------------------------------------------------- - - - ---------------------------------------------------op nameop name -op name ................ ................op name ................ installedinstalled ................installed.... compatiblecompatible..installed - ---------------------------------------------------compatible -..-------------------------------------------------- - ---------------------------------------------------compatible - --------------------------------------------------- -cpu_adam ............... [YES]cpu_adam cpu_adam.....................cpu_adam [OKAY]............... - [YES]...............[YES] ......[YES] ...... [OKAY] ...... -[OKAY]fused_adam [OKAY] - -............. [NO] ....... [OKAY] -fused_adam fused_lamb............. fused_adam.............fused_adam [NO] ............. .............[NO] [NO].......[NO] .......[OKAY]....... ....... - [OKAY][OKAY] - -fused_lamb[OKAY] -fused_lamb............. [NO].............fused_lamb .......[NO]............. sparse_attn[OKAY] ....... -[NO]............ [OKAY] ....... -[NO] [OKAY]....... - [OKAY] -sparse_attn ............transformer sparse_attn[NO]............ .......[NO]............ .......sparse_attn[OKAY] [NO] -[OKAY] -................... transformer[OKAY][NO]stochastic_transformer - .................... transformer[NO][OKAY][NO] - .......................... transformer [OKAY] [OKAY] - -[NO]............ .......[NO]stochastic_transformer [OKAY]....... 
- .[OKAY] -stochastic_transformer[NO] ....... .[OKAY]stochastic_transformer - [NO] ........ [NO][OKAY] -....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_ioasync_io .............................. [NO][NO] .............. [NO][NO] - -transformer_inferencetransformer_inference .... [NO][NO] .............. [OKAY][OKAY] - -utils utils.................. ..................[YES] [YES]...... ......[OKAY] -[OKAY] -quantizerquantizer ............................ [NO][NO] .............. [OKAY][OKAY] - --------------------------------------------------- --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. ------------------------------------------------------------------------------------------------------------------------------------------------------- - -DeepSpeed C++/CUDA extension op report-------------------------------------------------- -DeepSpeed C++/CUDA extension op report - - -DeepSpeed C++/CUDA extension op report---------------------------------------------------------------------------------------------------- - - -async_io ............... [NO] ....... [NO] -DeepSpeed C++/CUDA extension op reportNOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. 
Op compatibility means that your system - meet the required dependencies to JIT install the op.--------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - - - -----------------------------------------------------------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.-------------------------------------------------- - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - -transformer_inference .. [NO] ....... [OKAY] -JIT compiled ops requires ninjaJIT compiled ops requires ninja - --------------------------------------------------- --------------------------------------------------- - -JIT compiled ops requires ninja -JIT compiled ops requires ninja -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_io async_io............... [NO]............... ....... [NO][NO] -....... [NO] -transformer_inference transformer_inference.. ..[NO] [NO]....... .......[OKAY] -[OKAY] -/bin/sh: line 0: type: git: not found -utilsutils .................................... [YES][YES] ............ [OKAY][OKAY] - -quantizer quantizer.............. ..............[NO] [NO]....... .......[OKAY] -[OKAY] --------------------------------------------------- --------------------------------------------------- -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_io ............... async_io[NO] ...................... [NO][NO] -....... [NO] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -transformer_inference .. [NO] transformer_inference....... ..[OKAY] -[NO] ....... [OKAY] -async_io ............... [NO] ....... [NO] -utils .................. [YES] ...... 
utils[OKAY] -transformer_inference .. [NO] ....... [OKAY] -.................. [YES] quantizer...... ..............[OKAY] -[NO] ....... [OKAY]quantizer - .............. [NO] .......-------------------------------------------------- -[OKAY] -utils .................. [YES] ...... [OKAY] --------------------------------------------------- -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -ninjaninjaninjaninja ........................................................................ [OKAY][OKAY][OKAY][OKAY] - - - ------------------------------------------------------------------------------------------------------------------------------------------------------- - --------------------------------------------------- -op name -op nameop nameop name ................................ ................................ installedinstalledinstalled installed .. .. .... compatiblecompatiblecompatiblecompatible - - - ------------------------------------------------------------------------------------------------------------------------------------------------------- - --------------------------------------------------- - -cpu_adam cpu_adamcpu_adam............... cpu_adam ............... [YES] .............................. ......[YES] ......[OKAY][YES] -[YES] [OKAY]............ - [OKAY][OKAY] - -fused_adam ............. [NO] ....... [OKAY] -fused_adamfused_adamfused_adam ..........................fused_lamb ..........................[NO] [NO] [NO][NO].............. ....... .......[OKAY][OKAY] -[OKAY] -[OKAY] -fused_lamb - fused_lamb............. fused_lamb.............[NO] .................... [NO][NO]sparse_attn[OKAY] - .......................... [OKAY][NO][OKAY] - -....... [OKAY] -transformersparse_attn ........................ [NO]sparse_attn [NO] ....... sparse_attn............ ....... ............[OKAY] -[NO][OKAY][NO] ....... - stochastic_transformer[OKAY]....... transformer - [OKAY]............ .transformer -[NO][NO] ..........................transformer [NO][OKAY] - [OKAY]................... - stochastic_transformer[NO][OKAY] - ....... .[OKAY] stochastic_transformer -[NO] ........ stochastic_transformer [OKAY][NO] - ....... .[OKAY] -[NO] ....... [OKAY] ----------------------------------------------------------------------------------------------------- - -DeepSpeed C++/CUDA extension op reportDeepSpeed C++/CUDA extension op report - ------------------------------------------------------------------------------------------------------------------------------------------------------- - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. 
- - -DeepSpeed C++/CUDA extension op report-------------------------------------------------- --------------------------------------------------- - ---------------------------------------------------JIT compiled ops requires ninjaJIT compiled ops requires ninja - - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. ----------------------------------------------------------------------------------------------------- - -JIT compiled ops requires ninjaDeepSpeed C++/CUDA extension op report - --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_io async_io............... [NO]............... .......[NO] [NO]....... - [NO] -transformer_inference .. [NO]transformer_inference ......... [OKAY][NO] - ....... [OKAY] -utils .................. [YES] utils...... ..................[OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -[YES] ...... [OKAY]quantizer - .............. [NO]quantizer ..................... [OKAY][NO] - ....... [OKAY] -async_io ............... [NO] ....... [NO] --------------------------------------------------- --------------------------------------------------- -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. 
-async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install path ...............torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version torch version.................... ....................1.8.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -1.8.1 -torch cuda version ...............torch cuda version 11.1............... - nvcc version11.1 -async_io ............... [NO] ....... [NO] -.....................nvcc version 11.2..................... - deepspeed install path11.2 -transformer_inference .. [NO] ....... [OKAY] -...........deepspeed install path ...........['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']deepspeed info -utils .................. [YES] ...... [OKAY] - ...................deepspeed info 0.4.2+bc17042, bc17042, big-science................... - 0.4.2+bc17042, bc17042, big-sciencedeepspeed wheel compiled w. - deepspeed wheel compiled w....... ......torch 1.8, cuda 11.1 -torch 1.8, cuda 11.1 -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -DeepSpeed general environment info: -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] --------------------------------------------------- -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -transformer_inference .. [NO] utils....... ..................[OKAY] -[YES] ...... [OKAY] -utils ..................quantizer [YES].............. ......[NO] [OKAY] -....... [OKAY] -quantizer ..............-------------------------------------------------- -[NO] ....... 
[OKAY] --------------------------------------------------- -ninjaninjaninjaninja ...................................................... .................. [OKAY] [OKAY] -[OKAY] -[OKAY]-------------------------------------------------- - --------------------------------------------------- - -----------------------------------------------------------------------------------------------------op name -op name - op nameop name ................ ................................installed................ .. installed installedinstalled compatible - ......-------------------------------------------------- -compatiblecompatiblecompatible - - ----------------------------------------------------------------------------------------------------- --------------------------------------------------- - -cpu_adam ...............cpu_adam cpu_adamcpu_adam[YES] .................................... ............... [YES][YES] [OKAY] [YES]...... -...... ......[OKAY][OKAY] - -[OKAY] -fused_adam .............fused_adam fused_adam[NO] fused_adam ............. ....... .......................... [NO] [OKAY] -[NO].......[NO] fused_lamb.......[OKAY] ....... -............. [OKAY] [OKAY][NO] -fused_lamb - ....................fused_lamb [OKAY][NO]fused_lamb............. - ....................[NO] [OKAY] [NO] -....... .......[OKAY] -[OKAY] -sparse_attn ............ [NO] .......sparse_attn [OKAY]............sparse_attnsparse_attn - [NO]............transformer ............ ................... [NO] [OKAY] [NO][NO] - ....... ....... transformer....... [OKAY]............[OKAY] -[OKAY] - -[NO] transformer.......stochastic_transformer transformer[OKAY]............ -. ............ [NO] [NO]stochastic_transformer[NO] ....... ............... [OKAY][NO][OKAY][OKAY] - - -....... [OKAY] -stochastic_transformerstochastic_transformer .. [NO][NO] .............. [OKAY][OKAY] - - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_io ............... [NO] ....... [NO]async_io - ...............async_io [NO]............... .......[NO] [NO]....... - [NO]transformer_inference - .. [NO] ....... [OKAY] -transformer_inference .. [NO]transformer_inference utils ......... ..................[OKAY][NO] - [YES]....... ......[OKAY] -[OKAY]utils - .................. [YES]quantizer utils ...... .............. .................. [OKAY] [NO] -[YES] ............. 
[OKAY][OKAY] - -quantizer .............. quantizer--------------------------------------------------[NO] - ..................... [NO][OKAY] -....... [OKAY]-------------------------------------------------- - --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_io ............... [NO] ....... [NO]async_io - ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY]transformer_inference - .. [NO] ....... [OKAY]utils - .................. [YES] ...... [OKAY] -utils .................. [YES]quantizer .................... [OKAY][NO] - ....... [OKAY] -quantizer .............. [NO] --------------------------------------------------....... - [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_ioasync_io .............................. [NO][NO] .............. [NO][NO] - -transformer_inference transformer_inference .. ..[NO] [NO]....... .......[OKAY] -[OKAY] -utils ..................utils [YES].................. ......[YES] [OKAY]...... -DeepSpeed general environment info: - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - [OKAY]quantizer -async_io ............... [NO] ....... [NO] - .............. [NO]quantizer ....... [OKAY] -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -torch version .................... 1.8.1 -transformer_inference .. [NO] ....... [OKAY] -.............. [NO] .......-------------------------------------------------- -[OKAY] -torch cuda version ............... 11.1 -async_ioutils ................................. [NO][YES] ............. [NO][OKAY] --------------------------------------------------- -nvcc version ..................... 11.2 - -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -quantizer .............. [NO] ....... [OKAY] -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 --------------------------------------------------- -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. 
[WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_io ............... [NO] async_io....... [NO]............... - [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -transformer_inference .. [NO]transformer_inference ......... [OKAY][NO] - ....... [OKAY] -utils .................. [YES] ...... [OKAY] -utils .................. [YES] utils...... ..................[OKAY] -[YES] ...... [OKAY]quantizer -quantizer .............. [NO] ....... [OKAY] - .............. [NO] quantizer....... ..............[OKAY] -[NO] ....... [OKAY]-------------------------------------------------- - --------------------------------------------------- --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] -DeepSpeed general environment info: --------------------------------------------------- -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_ioasync_io .............................. [NO][NO] .............. [NO][NO] - -transformer_inferencetransformer_inference .... [NO][NO] .............. [OKAY][OKAY] - -utilsutils .................. ..................[YES] [YES]...... ......[OKAY] - [OKAY] -quantizer .............. [NO]quantizer ..................... [OKAY] -[NO] ....... --------------------------------------------------[OKAY] - --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -utils .................. [YES] ...... [OKAY] -async_io ............... [NO] ....... [NO] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 
0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -DeepSpeed general environment info:torch install path ............... - torch install path ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']............... - torch version .................... 1.8.1['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch cuda version torch version............... ....................11.1 -1.8.1nvcc version - ..................... torch cuda version11.2 -...............deepspeed install path 11.1........... - nvcc version ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']..................... - deepspeed info11.2 -...................deepspeed install path 0.4.2+bc17042, bc17042, big-science........... - deepspeed wheel compiled w.['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] ...... - deepspeed infotorch 1.8, cuda 11.1 -................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja ----------------------------------------------------------------------------------------------------- - ---------------------------------------------------DeepSpeed C++/CUDA extension op reportDeepSpeed C++/CUDA extension op report - - -DeepSpeed C++/CUDA extension op report-------------------------------------------------- --------------------------------------------------- --------------------------------------------------- - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - - ----------------------------------------------------------------------------------------------------- --------------------------------------------------- -JIT compiled ops requires ninja -JIT compiled ops requires ninja -JIT compiled ops requires ninja - -/bin/sh: line 0: type: git: not found - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -DeepSpeed general environment info: -transformer_inference .. [NO] ....... [OKAY] -torch install path ............... 
['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -utils .................. [YES] ...... [OKAY] -torch version .................... 1.8.1 -quantizer .............. [NO] ....... [OKAY] -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 --------------------------------------------------- -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install pathtorch install path .............................. ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch versiontorch version ........................................ 1.8.11.8.1 - -torch cuda versiontorch cuda version .............................. 11.111.1 - -nvcc versionnvcc version .......................................... 11.211.2 - -deepspeed install pathdeepspeed install path ...................... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed infodeepspeed info ...................................... 0.4.2+bc17042, bc17042, big-science0.4.2+bc17042, bc17042, big-science - -deepspeed wheel compiled w.deepspeed wheel compiled w. ............ torch 1.8, cuda 11.1torch 1.8, cuda 11.1 - -DeepSpeed general environment info: - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -transformer_inference .. [NO] ....... [OKAY] -torch version .................... 1.8.1 -utils .................. [YES] ...... [OKAY] -torch cuda version ............... 11.1 -quantizer .............. [NO] ....... [OKAY] -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 --------------------------------------------------- -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -ninjaninjaninjaninja ........................................................................ [OKAY] [OKAY][OKAY][OKAY] - - - ----------------------------------------------------------------------------------------------------- --------------------------------------------------- --------------------------------------------------- -op nameop name -op name ................op name................................ ................installed installed.. installedinstalled ..compatible .. -.. 
compatible--------------------------------------------------compatible - - -compatible-------------------------------------------------- - --------------------------------------------------- --------------------------------------------------- -cpu_adamcpu_adam ...............cpu_adam ............... [YES] ...............cpu_adam [YES] ...... [YES][OKAY]..................... - ......[YES][OKAY] -[OKAY]...... - [OKAY] -fused_adam ............. [NO] fused_adam .......fused_adam ............. [OKAY] ............. -[NO] [NO].......fused_adamfused_lamb ............. .......[OKAY]............. - [OKAY][NO][NO] - fused_lamb..............fused_lamb [OKAY] - .............[OKAY]............. -fused_lamb[NO] [NO]............. .......[NO]....... [OKAY] ....... - [OKAY][OKAY] - -sparse_attn ............ [NO] ....... [OKAY] -sparse_attntransformer sparse_attn ............ ........................[NO] [NO][NO]sparse_attn ....... ....... ....... [OKAY] [OKAY] -[OKAY]............ - - transformer[NO] transformer............ ............[NO] stochastic_transformer....... [NO].......[OKAY]. ....... - [OKAY] [NO] -[OKAY] transformer -.......stochastic_transformer stochastic_transformer[OKAY]............. - [NO][NO] . ....... ....... [NO] [OKAY] [OKAY] -....... - [OKAY] -stochastic_transformer . [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** --------------------------------------------------- ---------------------------------------------------DeepSpeed C++/CUDA extension op report - ---------------------------------------------------DeepSpeed C++/CUDA extension op report-------------------------------------------------- - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -DeepSpeed C++/CUDA extension op report -----------------------------------------------------------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - - - ---------------------------------------------------JIT compiled ops requires ninja--------------------------------------------------DeepSpeed C++/CUDA extension op report - - - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. -JIT compiled ops requires ninja---------------------------------------------------------------------------------------------------- - - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. 
Op compatibility means that your system - meet the required dependencies to JIT install the op.JIT compiled ops requires ninja - --------------------------------------------------- -JIT compiled ops requires ninja ----------------------------------------------------------------------------------------------------- - -DeepSpeed C++/CUDA extension op report-------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.DeepSpeed C++/CUDA extension op report---------------------------------------------------------------------------------------------------- - - - -----------------------------------------------------------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. -DeepSpeed C++/CUDA extension op reportJIT compiled ops requires ninja - - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. ----------------------------------------------------------------------------------------------------- - - ---------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.JIT compiled ops requires ninja - - -JIT compiled ops requires ninja-------------------------------------------------- - -JIT compiled ops requires ninja - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_io ............... [NO] .......async_io [NO]............... - [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -transformer_inference .. [NO] .......utils [OKAY].................. - [YES] ...... [OKAY] -utils ..................quantizer [YES].............. ......[NO] [OKAY]....... - [OKAY] -quantizer ..............-------------------------------------------------- -[NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... async_io[NO] ...................... [NO][NO] -....... [NO] -transformer_inference .. [NO] ....... transformer_inference[OKAY] -.. [NO] ....... [OKAY]utils - .................. [YES] ......utils [OKAY].................. - [YES] ......quantizer [OKAY].............. - [NO] .......quantizer [OKAY].............. 
- [NO] .......-------------------------------------------------- -[OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -ninjaninjaninjaninja ........................................................................ [OKAY][OKAY][OKAY][OKAY] - - - -DeepSpeed general environment info: --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- - - - -async_io ............... [NO] ....... [NO] -op nameop name op nameop name ................ ................ ................................ installed installed installedinstalled .. .. .... compatible compatiblecompatible -compatible - --------------------------------------------------- ----------------------------------------------------------------------------------------------------- - --------------------------------------------------- - -transformer_inference .. [NO] ....... [OKAY] -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -utils .................. [YES] ...... [OKAY] -cpu_adamcpu_adam cpu_adamcpu_adam ............... ............... ............... [YES]............... [YES] [YES][YES] ...... ...... ............ [OKAY] -[OKAY][OKAY][OKAY] - - -torch cuda version ............... 11.1 -quantizer .............. [NO] ....... [OKAY] -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] --------------------------------------------------- -fused_adam .............fused_adamfused_adamfused_adam [NO]....................................... .......[NO][NO][NO] [OKAY]....... ....... - ....... [OKAY] [OKAY] -[OKAY] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -fused_lamb -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - fused_lambfused_lamb............. .............[NO].............fused_lamb .......[NO]............. [NO] .......[OKAY][NO]....... - [OKAY].......[OKAY] -async_io ............... [NO] ....... [NO] - -[OKAY] -transformer_inference .. [NO] ....... [OKAY] -sparse_attn sparse_attn............sparse_attn ............sparse_attn[NO] ...............................[NO] [NO] [OKAY] [NO]....... -.............. [OKAY][OKAY]transformer[OKAY] -utils .................. [YES] ...... [OKAY] - - -quantizer .............. [NO] ....... [OKAY] -............transformer transformertransformer[NO] ............ ............ ............ ....... [NO][NO] [NO] [OKAY]....... --------------------------------------------------- -....... ....... [OKAY][OKAY][OKAY] -stochastic_transformer - - stochastic_transformer. stochastic_transformerstochastic_transformer[NO] . ........[NO]. [NO][OKAY].......[NO] - .......[OKAY]....... - [OKAY][OKAY] - - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... 
-DeepSpeed general environment info:
-torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']
-torch version .................... 1.8.1
-torch cuda version ............... 11.1
-nvcc version ..................... 11.2
-deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']
-deepspeed info ................... 0.4.2+bc17042, bc17042, big-science
-deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1
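Each worker prints an identical "DeepSpeed general environment info" block. The same fields can be re-derived from the installed packages; a rough sketch, assuming torch and deepspeed import cleanly (the real report's formatting and its nvcc probing differ):

    import torch
    import deepspeed

    # Reproduce the fields of the environment block above.
    print("torch install path ....", list(torch.__path__))
    print("torch version .........", torch.__version__)
    print("torch cuda version ....", torch.version.cuda)
    print("deepspeed install path ", list(deepspeed.__path__))
    print("deepspeed info ........", deepspeed.__version__)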
['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -...... ...... ...... [OKAY] [OKAY] -[OKAY] - -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -fused_adam ............. [NO]fused_adam fused_adamfused_adam ....... ............. .......................... [NO][OKAY][NO] - .......[NO]....... fused_lamb.......[OKAY] [OKAY] -............. -DeepSpeed general environment info: -[OKAY] fused_lamb -[NO]fused_lamb ....................fused_lamb............. [NO] [NO] .............[OKAY] ....... -....... [NO] [OKAY] [OKAY] -....... - [OKAY] -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -sparse_attn ............sparse_attnsparse_attn sparse_attn [NO]........................ [NO]................... [NO] [NO] .......[OKAY] ....... ....... -[OKAY] -torch cuda version ............... 11.1 -[OKAY][OKAY]transformer - -nvcc version ..................... 11.2 -transformer............ transformer ............ transformer[NO] [NO]............................... .......[NO][OKAY][NO] - [OKAY].............. - [OKAY][OKAY] - -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -DeepSpeed general environment info: -stochastic_transformerstochastic_transformer stochastic_transformerstochastic_transformer. . [NO] .. [NO] [NO] ....... [NO] .............. [OKAY] ....... -[OKAY] [OKAY] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -[OKAY] - -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... 
torch 1.8, cuda 11.1 -DeepSpeed general environment info: -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -torch version .................... 1.8.1 -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... 
['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install pathtorch install path .............................. ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch versiontorch version ........................................ 1.8.11.8.1 - -torch cuda versiontorch cuda version .............................. 11.111.1 - -nvcc versionnvcc version .......................................... 11.211.2 - -deepspeed install pathdeepspeed install path ...................... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed infodeepspeed info ...................................... 0.4.2+bc17042, bc17042, big-science0.4.2+bc17042, bc17042, big-science - -deepspeed wheel compiled w.deepspeed wheel compiled w. ............ torch 1.8, cuda 11.1torch 1.8, cuda 11.1 - - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -async_ioasync_io .............................. [NO][NO] .............. [NO][NO] - -transformer_inference transformer_inference.. ..[NO] [NO]....... .......[OKAY] -[OKAY] -utils .................. [YES] ...... utils[OKAY] -.................. [YES] ......quantizer [OKAY].............. - [NO] .......quantizer [OKAY].............. - [NO] .......-------------------------------------------------- -[OKAY] --------------------------------------------------- -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install pathtorch install path ............... ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -DeepSpeed general environment info:DeepSpeed general environment info: - -torch versiontorch version ........................................ 1.8.11.8.1 - -torch cuda versiontorch cuda version .............................. 11.111.1 - -nvcc versionnvcc version .......................................... 11.211.2 - -torch install pathtorch install path .............................. ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch versiontorch version ........................................ 1.8.11.8.1 - -deepspeed install pathdeepspeed install path ...................... 
['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed infodeepspeed info ...................................... 0.4.2+bc17042, bc17042, big-science0.4.2+bc17042, bc17042, big-science -torch cuda versiontorch cuda version .............................. 11.111.1 - - -deepspeed wheel compiled w. deepspeed wheel compiled w....... ......torch 1.8, cuda 11.1 -torch 1.8, cuda 11.1 -nvcc versionnvcc version .......................................... 11.211.2 - -deepspeed install pathdeepspeed install path ...................... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed infodeepspeed info ...................................... 0.4.2+bc17042, bc17042, big-science0.4.2+bc17042, bc17042, big-science - -deepspeed wheel compiled w. deepspeed wheel compiled w....... ......torch 1.8, cuda 11.1 -torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_io ............... [NO] ....... [NO]async_io - ............... [NO] ....... [NO] -transformer_inference .. transformer_inference[NO] ......... [NO][OKAY] -....... [OKAY] -utils .................. utils[YES] ........................ [YES][OKAY] -...... [OKAY] -quantizer .............. [NO]quantizer ..................... [OKAY][NO] - ....... [OKAY] --------------------------------------------------- --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- --------------------------------------------------- ---------------------------------------------------DeepSpeed C++/CUDA extension op report - --------------------------------------------------- -DeepSpeed C++/CUDA extension op reportNOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.-------------------------------------------------- - ----------------------------------------------------------------------------------------------------- - --------------------------------------------------- -JIT compiled ops requires ninjaDeepSpeed C++/CUDA extension op reportDeepSpeed C++/CUDA extension op report - - - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.---------------------------------------------------------------------------------------------------- - - ---------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. 
Op compatibility means that your system - meet the required dependencies to JIT install the op. - - -JIT compiled ops requires ninja-------------------------------------------------- --------------------------------------------------- - -JIT compiled ops requires ninja -JIT compiled ops requires ninja -/bin/sh: line 0: type: git: not found -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install pathtorch install path ............... ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch versiontorch version ........................................ 1.8.11.8.1 - -torch cuda versiontorch cuda version ............... ...............11.1 -11.1 ------------------------------------------------------------------------------------------------------------------------------------------------------- - - -DeepSpeed C++/CUDA extension op reportDeepSpeed C++/CUDA extension op reportDeepSpeed C++/CUDA extension op report - - ------------------------------------------------------------------------------------------------------------------------------------------------------- --------------------------------------------------- - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system -nvcc version nvcc version..................... .....................11.2 -11.2deepspeed install path - deepspeed install path........... ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - meet the required dependencies to JIT install the op. ---------------------------------------------------DeepSpeed C++/CUDA extension op report --------------------------------------------------- - --------------------------------------------------- -JIT compiled ops requires ninja--------------------------------------------------JIT compiled ops requires ninja - - - -JIT compiled ops requires ninjaNOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - --------------------------------------------------- -JIT compiled ops requires ninja -['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']deepspeed info - deepspeed info................... ................... 0.4.2+bc17042, bc17042, big-science0.4.2+bc17042, bc17042, big-science - -deepspeed wheel compiled w.deepspeed wheel compiled w. ............ torch 1.8, cuda 11.1torch 1.8, cuda 11.1 - -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** ----------------------------------------------------------------------------------------------------- - -DeepSpeed C++/CUDA extension op report -DeepSpeed C++/CUDA extension op report-------------------------------------------------- --------------------------------------------------- --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. 
Op compatibility means that your system - meet the required dependencies to JIT install the op. - -DeepSpeed C++/CUDA extension op reportNOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.-------------------------------------------------- - --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. -JIT compiled ops requires ninja-------------------------------------------------- - -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_ioasync_io .............................. [NO][NO] .............. [NO][NO] - -transformer_inference .. transformer_inference[NO] ......... [NO][OKAY] -....... [OKAY] -utils .................. [YES] utils...... ..................[OKAY] -[YES] ...... quantizer[OKAY] -.............. [NO]quantizer ..................... [OKAY][NO] - ....... [OKAY] --------------------------------------------------- --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report ----------------------------------------------------------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - ---------------------------------------------------DeepSpeed C++/CUDA extension op report---------------------------------------------------------------------------------------------------- - - - -JIT compiled ops requires ninja--------------------------------------------------DeepSpeed C++/CUDA extension op reportDeepSpeed C++/CUDA extension op report - - - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. 
Op compatibility means that your system - meet the required dependencies to JIT install the op.---------------------------------------------------------------------------------------------------- - - --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.JIT compiled ops requires ninja - - ----------------------------------------------------------------------------------------------------- - -JIT compiled ops requires ninja -JIT compiled ops requires ninja -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -ninjaninjaninjaninja .................. .................................... .................. [OKAY][OKAY][OKAY] - - -[OKAY]---------------------------------------------------------------------------------------------------- - --------------------------------------------------- -op name -op name --------------------------------------------------................op name -DeepSpeed general environment info: - ................op name................installed ................ ..installed installed installed ..compatible .. - compatible--------------------------------------------------..compatible - - - ----------------------------------------------------------------------------------------------------compatible - - -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] --------------------------------------------------- -torch version .................... 1.8.1DeepSpeed general environment info: -torch cuda version -cpu_adam ............... cpu_adam[YES] cpu_adam..................... ............... cpu_adam[OKAY] [YES] -............... 11.1 -[YES] ............... ...... ...... [YES] [OKAY] [OKAY]fused_adam...... - - .............[OKAY] [NO] - ....... [OKAY]fused_adam -nvcc version torch install path..................... ...............11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -fused_adam ..........................fused_lamb [NO]fused_adam[NO]............. ....... ....................[NO] [OKAY][OKAY].......[NO] - - [OKAY]....... - -deepspeed info ...................torch version 0.4.2+bc17042, bc17042, big-science.................... -fused_lamb fused_lamb [OKAY] .......................... - deepspeed wheel compiled w.1.8.1 -...... torch 1.8, cuda 11.1torch cuda version - [NO][NO] fused_lamb.......sparse_attn....... ............[OKAY] ............. -[NO] [OKAY] [NO] - ............... 11.1 -nvcc version ..................... 11.2 -....... .......[OKAY] - [OKAY] -deepspeed install path ........... 
['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -transformersparse_attn ........................ [NO][NO] sparse_attn ....... ....... sparse_attn[OKAY]............ -[OKAY] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -............[NO] [NO]transformerstochastic_transformer....... ....... ............ [OKAY] [OKAY] -.[NO] -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [NO]....... transformer.......transformer[OKAY] [OKAY] -............ - ............[NO] [NO]....... stochastic_transformer ....... [OKAY] -[OKAY]. - [NO] .......stochastic_transformer stochastic_transformer [OKAY] -. .[NO] [NO]....... .......[OKAY] -[OKAY] -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -ninjaninjaninjaninja .................. .................. .................. [OKAY].................. [OKAY] -[OKAY] - -[OKAY]---------------------------------------------------------------------------------------------------- - - ---------------------------------------------------op nameop name - ................................op name installedinstalled................ ....--------------------------------------------------installed - compatiblecompatible..op name - - compatible---------------------------------------------------------------------------------------------------- -................ - - --------------------------------------------------installed -cpu_adam ...............cpu_adam.. [YES]...............cpu_adamcompatible ......[YES]............... [OKAY]...... -[YES] - --------------------------------------------------......[OKAY] - -[OKAY] -fused_adam ............. [NO] .......fused_adam cpu_adam [OKAY]fused_adam -............. .............[NO] fused_lamb [NO]............... ....... ....................[YES] [OKAY] [OKAY] -[NO] - .............fused_lambfused_lamb [OKAY][OKAY]............. -............. -DeepSpeed general environment info: - [NO][NO] .............. [OKAY][OKAY] - -sparse_attn fused_adam............ [NO]sparse_attn sparse_attn ............................................ [NO][OKAY][NO][NO] -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - .......transformer.............. ............[OKAY][OKAY] -[NO] -[OKAY] -transformertransformer....... ........................[OKAY]fused_lamb -torch version .................... 1.8.1 -ninjaninjaninjaninja ........................................................................ [OKAY] [OKAY][OKAY] -[OKAY] - - ----------------------------------------------------------------------------------------------------- - -----------------------------------------------------------------------------------------------------op nameop name - - [NO][NO] .............. stochastic_transformer.............[OKAY][OKAY] - -torch cuda version ............... 11.1 - ................op name ................op name installed ................ installedinstalled.................. ....installedcompatible -[NO]. 
stochastic_transformer[NO]stochastic_transformer ............... .[OKAY][NO] - [OKAY][NO]....... -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - compatiblecompatible --------------------------------------------------- -..-------------------------------------------------- - --------------------------------------------------- -compatible --------------------------------------------------- - .......[OKAY] -[OKAY] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -cpu_adam ...............cpu_adam cpu_adam [YES] .................................... [YES]cpu_adam[YES][OKAY] -........................... [OKAY][YES][OKAY] - -...... [OKAY]fused_adam -sparse_attn ............ [NO] ....... [OKAY] - ............. [NO]fused_adam fused_adam.................... [OKAY].............[NO] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] - fused_adam[NO]fused_lamb....... ....... ..........................[OKAY] [OKAY] -[NO] -[NO] .......fused_lamb fused_lamb [OKAY] .................... -............. [NO][NO] [OKAY] ....... -....... [OKAY][OKAY] -sparse_attn -fused_lamb ............ [NO]............. ....... [NO][OKAY] - sparse_attn.......transformer sparse_attn ............ ............ [OKAY][NO]............[NO] -....... [NO].......[OKAY] -[OKAY].......transformer - ............[OKAY] stochastic_transformer[NO] - ........sparse_attn transformer[OKAY][NO] - ............................... stochastic_transformer[NO][OKAY][NO] - ....... ........[OKAY] -[OKAY][NO] - .......stochastic_transformer [OKAY]transformer -. [NO]............ ....... [NO][OKAY] - ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -DeepSpeed general environment info: -torch install pathDeepSpeed general environment info: ............... -DeepSpeed general environment info: -torch install path ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']............... - torch version .................... 1.8.1['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -torch cuda version torch version............... ....................11.1 -1.8.1 -nvcc version .....................torch cuda version 11.2............... - deepspeed install path11.1 -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -...........nvcc version ..................... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']11.2 - -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -transformer_inferenceasync_io .. ...............[NO] [NO]....... .......[OKAY] -[NO] -deepspeed infodeepspeed install path .............................. 
0.4.2+bc17042, bc17042, big-science -['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']deepspeed wheel compiled w. - deepspeed info...... ...................torch 1.8, cuda 11.1 -utils .................. [YES] ...... [OKAY] -0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -transformer_inference quantizer.. [NO].............. .......[NO] [OKAY]....... - [OKAY] -utils-------------------------------------------------- -.................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -DeepSpeed general environment info:DeepSpeed general environment info: - -utils .................. [YES] ...... [OKAY] -torch install pathtorch install path .............................. ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch versiontorch version ........................................ 1.8.11.8.1 - -torch cuda version torch cuda version............... ...............11.1 -11.1 -quantizer .............. [NO] ....... [OKAY] -nvcc version nvcc version..................... .....................11.2 -11.2deepspeed install path - deepspeed install path........... ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']deepspeed info --------------------------------------------------- - deepspeed info................... ...................0.4.2+bc17042, bc17042, big-science -0.4.2+bc17042, bc17042, big-sciencedeepspeed wheel compiled w. - deepspeed wheel compiled w....... ......torch 1.8, cuda 11.1 -torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -ninjaninjaninjaninja ........................................................................ [OKAY] [OKAY] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 
11.2 -[OKAY][OKAY] - --------------------------------------------------- --------------------------------------------------- --------------------------------------------------- -op name-------------------------------------------------- -op name -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -op name op name ................ ................ ................installed................ installed..installedinstalled ....compatible.. compatible - -compatible--------------------------------------------------compatible --------------------------------------------------- - --------------------------------------------------- - --------------------------------------------------- -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -cpu_adam ............... [YES]cpu_adamcpu_adamcpu_adam ..................... .............................. [OKAY] -[YES][YES][YES] .................. [OKAY][OKAY][OKAY] - - -fused_adam ............. [NO] ....... [OKAY] -fused_adamfused_lamb fused_adamfused_adam .......................... ............. .............[NO][NO][NO] ....... [NO] ....... ....... [OKAY].......[OKAY] - -[OKAY][OKAY] - -fused_lamb ............. fused_lambfused_lamb[NO] ............. ............. .......sparse_attn[NO] [NO][OKAY]................... - .......[OKAY][NO] - .......[OKAY] -[OKAY] -transformersparse_attn ............ ............[NO]sparse_attn sparse_attn...................[NO] [NO]............[OKAY] ....... -....... [OKAY][NO]stochastic_transformer -[OKAY] -.......transformer . ............ [OKAY]transformer [NO] - [NO]............ ....... transformer....... [NO] [OKAY] ................... - [OKAY] [NO] -[OKAY] -....... stochastic_transformer[OKAY] -stochastic_transformer . .[NO]stochastic_transformer [NO]....... . ....... [OKAY] [NO][OKAY] - -....... [OKAY] -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. ----------------------------------------------------------------------------------------------------- - -DeepSpeed general environment info: -JIT compiled ops requires ninja -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -----------------------------------------------------------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. 
-DeepSpeed general environment info:
-torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']
-torch version .................... 1.8.1
-torch cuda version ............... 11.1
-nvcc version ..................... 11.2
-deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']
-deepspeed info ................... 0.4.2+bc17042, bc17042, big-science
-deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1
-/bin/sh: line 0: type: git: not found
-**** Git info for Megatron: git_hash=unknown git_branch=unknown ****
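The `type: git: not found` line explains the `unknown` hash and branch in the Megatron banner: the script probes git through the shell, and git is not on PATH in this environment. A hypothetical sketch of such a probe (not Megatron's actual code):

```python
# Hypothetical sketch of a git-metadata probe like the one behind the banner
# above; when git is missing from PATH the values fall back to "unknown".
import subprocess

def git_info(default="unknown"):
    try:
        git_hash = subprocess.check_output(
            ["git", "rev-parse", "--short", "HEAD"], text=True).strip()
        branch = subprocess.check_output(
            ["git", "rev-parse", "--abbrev-ref", "HEAD"], text=True).strip()
        return git_hash, branch
    except (OSError, subprocess.CalledProcessError):
        return default, default

git_hash, git_branch = git_info()
print(f"**** Git info for Megatron: git_hash={git_hash} git_branch={git_branch} ****")
```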
0.4.2+bc17042, bc17042, big-science -['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']deepspeed wheel compiled w. - deepspeed info...... ...................torch 1.8, cuda 11.1 - 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_io ............... async_io[NO] ...................... [NO][NO] -....... [NO] -transformer_inference .. [NO] transformer_inference....... ..[OKAY] -[NO] ....... [OKAY] -DeepSpeed general environment info: -utils .................. [YES] ......utils [OKAY].................. - [YES] quantizer...... ..............[OKAY] -[NO] ....... [OKAY] -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -quantizer .............. [NO]-------------------------------------------------- -....... [OKAY] --------------------------------------------------- -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: ------------------------------------------------------------------------------------------------------------------------------------------------------- - - -DeepSpeed C++/CUDA extension op reportDeepSpeed C++/CUDA extension op reportDeepSpeed C++/CUDA extension op report - - --------------------------------------------------- -----------------------------------------------------------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 - meet the required dependencies to JIT install the op. - - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.---------------------------------------------------------------------------------------------------- - - - -----------------------------------------------------------------------------------------------------JIT compiled ops requires ninjaDeepSpeed C++/CUDA extension op report - - - -JIT compiled ops requires ninja--------------------------------------------------JIT compiled ops requires ninja - - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 --------------------------------------------------- -JIT compiled ops requires ninja -deepspeed install path ........... 
['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -DeepSpeed general environment info: -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -ninjaninjaninjaninja ...................................................... .................. [OKAY] [OKAY][OKAY] -[OKAY] - - -nvcc version ..................... 11.2 --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- - - - -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -op nameop nameop nameop name ................................................................ installedinstalledinstalled installed .. .. ..compatible.. compatible - compatible ---------------------------------------------------compatible --------------------------------------------------- - - ----------------------------------------------------------------------------------------------------- - -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -cpu_adam ...............cpu_adam cpu_adam[YES]...............cpu_adam [YES] ..................... ............... ......[OKAY][YES] [YES] - [OKAY]...... -...... [OKAY][OKAY] - -async_io async_io............... [NO]............... .......[NO] [NO]....... - [NO] -fused_adam ............. [NO]fused_adam fused_adam....... fused_adam.......................... [OKAY][NO].............[NO] - .......[NO] .......fused_lamb[OKAY] -transformer_inferencetransformer_inference .... [NO][NO] .............. [OKAY][OKAY] - - ....................[OKAY]fused_lamb [NO] -[OKAY]............. -utils ..................utils [YES].................. ......[YES] [OKAY]...... - [OKAY] - fused_lamb[NO].......fused_lamb .................... [OKAY] ............. - [OKAY][NO][NO] -quantizer quantizer.............. ..............[NO] [NO]....... .......[OKAY] - .............. [OKAY][OKAY] - -[OKAY] ----------------------------------------------------------------------------------------------------- - -sparse_attn ............ sparse_attn[NO] ................... sparse_attn[NO] [OKAY]sparse_attn....... ............ - ............transformer[OKAY][NO] - [NO] ....... ............ ....... transformer[OKAY][OKAY][NO] - - ...................transformer transformer [OKAY][NO] ............ - ............ ....... 
[NO]stochastic_transformer [NO] [OKAY] ....... -....... . [OKAY] [OKAY][NO] - -stochastic_transformer .......stochastic_transformer stochastic_transformer .[OKAY] -.[NO] . [NO].......[NO] ....... [OKAY] ....... -[OKAY] -[OKAY] -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_io async_io............... ...............[NO] [NO]....... .......[NO] -[NO] -transformer_inference ..transformer_inference [NO].. .......[NO] [OKAY]....... - [OKAY] -utilsutils .................................... [YES][YES] ............ [OKAY][OKAY] - -quantizer quantizer.............. ..............[NO] [NO]....... .......[OKAY] -[OKAY] --------------------------------------------------- --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_ioasync_io ............... ...............[NO] [NO]....... .......[NO] -[NO] -transformer_inference transformer_inference.. ..[NO] [NO]....... .......[OKAY] - [OKAY] -utils ..................utils [YES].................. ......[YES] [OKAY]...... - [OKAY] -quantizerquantizer ............................ [NO][NO] .............. [OKAY][OKAY] - --------------------------------------------------- --------------------------------------------------- -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install pathtorch install path .............................. ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch versiontorch version ........................................ 1.8.1 -1.8.1 -torch cuda version torch cuda version............... ...............11.1 -11.1nvcc version - nvcc version..................... .....................11.2 -11.2deepspeed install path - deepspeed install path........... ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']deepspeed info - deepspeed info................... ...................0.4.2+bc17042, bc17042, big-science -0.4.2+bc17042, bc17042, big-sciencedeepspeed wheel compiled w. - deepspeed wheel compiled w....... ......torch 1.8, cuda 11.1 -torch 1.8, cuda 11.1 -ninjaninjaninjaninja ........................................................................ 
[OKAY][OKAY][OKAY][OKAY] - - - --------------------------------------------------- ------------------------------------------------------------------------------------------------------------------------------------------------------- - -op name - op nameop name................op name ................................installed ................ installedinstalled .... installedcompatible..compatible - - ..--------------------------------------------------compatible-------------------------------------------------- - - -compatible-------------------------------------------------- - --------------------------------------------------- -cpu_adamcpu_adam ..............................cpu_adam [YES][YES]cpu_adam............... ..................... ......[YES][OKAY] [YES] -[OKAY] ...... -...... [OKAY][OKAY] - -fused_adam ............. fused_adam[NO] .............fused_adam fused_adam[NO]....... [OKAY]................................. - [OKAY][NO] -[NO]fused_lamb .......fused_lamb.................... [OKAY].............[NO] - [OKAY] .......[NO] fused_lamb[OKAY] - - ....... .............[OKAY]fused_lamb - [NO]............. .......[NO] [OKAY]....... - [OKAY]sparse_attn - ............ sparse_attn[NO] ................... [NO][OKAY]sparse_attn -....... ............transformer[OKAY]sparse_attn - [NO]........................ transformer .......[NO] [NO] ................... [OKAY] [OKAY]....... -[NO] - [OKAY].......transformer - stochastic_transformer [OKAY] -............. transformer [NO]stochastic_transformer [NO] ............ ........ ....... [NO][OKAY][NO][OKAY] ....... - - .......[OKAY] -stochastic_transformer[OKAY] -. [NO]stochastic_transformer ....... [OKAY]. - [NO] ....... [OKAY] -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 ----------------------------------------------------------------------------------------------------- -DeepSpeed C++/CUDA extension op report - -DeepSpeed C++/CUDA extension op report-------------------------------------------------- - -----------------------------------------------------------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. 
Op compatibility means that your system - meet the required dependencies to JIT install the op.-------------------------------------------------- - - - -DeepSpeed C++/CUDA extension op report--------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. -DeepSpeed C++/CUDA extension op report - --------------------------------------------------- -JIT compiled ops requires ninja ----------------------------------------------------------------------------------------------------- - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. -JIT compiled ops requires ninja --------------------------------------------------- - --------------------------------------------------- -JIT compiled ops requires ninjaJIT compiled ops requires ninja - --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- ---------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. -DeepSpeed general environment info: - ----------------------------------------------------------------------------------------------------- -DeepSpeed C++/CUDA extension op reportJIT compiled ops requires ninja - - ---------------------------------------------------DeepSpeed C++/CUDA extension op report-------------------------------------------------- - - ---------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.DeepSpeed C++/CUDA extension op report - - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.-------------------------------------------------- --------------------------------------------------- - -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 ---------------------------------------------------JIT compiled ops requires ninjaNOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - - -JIT compiled ops requires ninja-------------------------------------------------- - -JIT compiled ops requires ninja -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... 
['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -ninjaninjaninjaninja ........................................................................ [OKAY][OKAY][OKAY][OKAY] - - - ------------------------------------------------------------------------------------------------------------------------------------------------------- --------------------------------------------------- - -op name -op name op name ................ op name................ ................ installed ................ installedinstalled .. installed.... compatiblecompatible..compatible - - -compatible-------------------------------------------------- - ------------------------------------------------------------------------------------------------------------------------------------------------------- - - -cpu_adam cpu_adamcpu_adam...............cpu_adam ...............[YES].............................. ......[YES][YES][YES] [OKAY]...... -............[OKAY] -[OKAY][OKAY] - -fused_adam fused_adam.............fused_adamfused_adam [NO]....................................... .......[NO][NO][NO] [OKAY].............. - ....... [OKAY] [OKAY][OKAY] -fused_lamb - - fused_lamb............. fused_lamb.............fused_lamb[NO] [NO].......................... ....... ....... [NO][NO][OKAY][OKAY] - -.............. [OKAY][OKAY] - -sparse_attnsparse_attn ............sparse_attn............ sparse_attn [NO][NO] ............ ............ ..............[NO][NO] [OKAY].......[OKAY]....... - -[OKAY] [OKAY] -transformer -transformer transformer ............ ............ transformer............ [NO] [NO] [NO] ............ .............. ....... [NO] [OKAY] [OKAY] -[OKAY]....... - -[OKAY]stochastic_transformer - stochastic_transformerstochastic_transformer . stochastic_transformer..[NO] [NO] .[NO] ....... ....... .......[NO][OKAY][OKAY] - -[OKAY]....... 
- [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed general environment info: -ninjaninjaninjaninja .................. .................. .................................... [OKAY] [OKAY][OKAY] - -[OKAY] --------------------------------------------------- - -------------------------------------------------------------------------------------------------------------------------------------------------------op name - - -op name................op name op name ................ installed................ ................ installed..installed installed....compatible -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version ....................DeepSpeed general environment info: 1.8.1 - -..compatiblecompatible --------------------------------------------------- - -------------------------------------------------- --------------------------------------------------- -compatible - --------------------------------------------------- -cpu_adamcpu_adam cpu_adam............... ..............................[YES]cpu_adam [YES]...............[YES]...... ......[YES][OKAY]...... -torch cuda version ............... 11.1torch install path - nvcc version............... ..................... 11.2 -deepspeed install path ........... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed infotorch version ....................................... 0.4.2+bc17042, bc17042, big-science1.8.1 - -[OKAY][OKAY]...... - - [OKAY] -deepspeed wheel compiled w. torch cuda version...... ...............torch 1.8, cuda 11.1 -11.1 -nvcc version ..................... 11.2 -DeepSpeed general environment info:DeepSpeed general environment info: - -fused_adam ............. fused_adamfused_adam[NO] .............fused_adam.................... [NO] [NO]............. [OKAY] ....... -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -....... [NO] [OKAY] .......[OKAY]fused_lamb - - [OKAY]............. -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -torch install pathtorch install path .............................. ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch versiontorch version ........................................ 1.8.11.8.1 -fused_lamb fused_lamb[NO]............. fused_lamb....................[NO] .......[OKAY] [NO] -............. [OKAY] ....... -[NO] [OKAY]....... - -torch cuda versiontorch cuda version .............................. 11.111.1 - - [OKAY] -nvcc versionnvcc version .......................................... 11.211.2 - -sparse_attn ............sparse_attn [NO]............ .......[NO] sparse_attn[OKAY] sparse_attn ....... - ............ ............ transformer[OKAY] [NO] - ............[NO]....... transformer [NO] ...................[OKAY] -deepspeed install pathdeepspeed install path ...................... 
['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed infodeepspeed info ...................................... 0.4.2+bc17042, bc17042, big-science0.4.2+bc17042, bc17042, big-science - -[OKAY].......[NO] - transformer [OKAY] transformer....... -deepspeed wheel compiled w.deepspeed wheel compiled w. ............ torch 1.8, cuda 11.1torch 1.8, cuda 11.1 - -............ ............ stochastic_transformer [OKAY][NO][NO] - ............... stochastic_transformer[NO] [OKAY] [OKAY] - -........ [NO][OKAY] -stochastic_transformerstochastic_transformer....... [OKAY]. -. [NO] .......[NO] [OKAY]....... - [OKAY] -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version DeepSpeed general environment info:............... 11.1 - -nvcc version ..................... torch install path11.2 - deepspeed install path............... ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']................... - 0.4.2+bc17042, bc17042, big-science -torch versiondeepspeed wheel compiled w. .......................... 1.8.1torch 1.8, cuda 11.1 - -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_ioasync_io .............................. [NO][NO] .............. [NO][NO] - -transformer_inference transformer_inference.. [NO] ......... [OKAY][NO] - ....... [OKAY] -utils .................. [YES] ......utils [OKAY].................. - [YES] ...... [OKAY] -quantizer .............. [NO]quantizer ..................... [OKAY][NO] - --------------------------------------------------....... - [OKAY] -/bin/sh: line 0: type: git: not found --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] -/bin/sh: line 0: type: git: not found --------------------------------------------------- -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 
11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_io ...............async_io [NO] ...................... [NO][NO] -....... [NO] -transformer_inferencetransformer_inference .... [NO][NO] .............. [OKAY][OKAY] - -utilsutils .................................... [YES][YES] ............ [OKAY][OKAY] - -quantizerquantizer ............................ [NO][NO] .............. [OKAY][OKAY] - --------------------------------------------------- --------------------------------------------------- -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ...............DeepSpeed general environment info: 11.1 -nvcc version - ..................... 11.2 -deepspeed install path torch install path........... ...............['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w.['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -...... torch 1.8, cuda 11.1 -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 
0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... 
['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -DeepSpeed general environment info: -transformer_inference .. [NO] ....... [OKAY] -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -utils .................. [YES] ...... [OKAY] -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -quantizer .............. [NO] ....... [OKAY] -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] --------------------------------------------------- -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] .......async_io  [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`.[NO] -............... -[NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY]transformer_inference - ..async_io [NO] utils...................... ..................[NO][OKAY] -[YES]....... ......[NO] -[OKAY]utils - .................. quantizer[YES] .................... [NO][OKAY] -.......transformer_inference [OKAY]quantizer.. - ..............[NO] --------------------------------------------------[NO]....... 
- .......[OKAY] -[OKAY] ---------------------------------------------------utils - .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- ---------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - ---------------------------------------------------DeepSpeed C++/CUDA extension op report - -JIT compiled ops requires ninja-------------------------------------------------- ----------------------------------------------------------------------------------------------------- - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. -DeepSpeed C++/CUDA extension op report - ---------------------------------------------------DeepSpeed C++/CUDA extension op report --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.-------------------------------------------------- - ---------------------------------------------------JIT compiled ops requires ninja - -JIT compiled ops requires ninja - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_io ............... [NO] ....... [NO]async_io - ............... [NO] ....... [NO] -transformer_inference .. [NO] transformer_inference....... ..[OKAY] -[NO] ....... [OKAY] -utils .................. [YES]utils ........................ [OKAY][YES] - ...... [OKAY] -quantizer .............. quantizer[NO] ..................... [NO][OKAY] -....... [OKAY] --------------------------------------------------- --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -DeepSpeed general environment info: -async_io ............... [NO] ....... [NO] -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -transformer_inference .. [NO] ....... [OKAY] -torch version .................... 1.8.1 -utils .................. [YES] ...... [OKAY] -torch cuda version ............... 11.1 --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- - - - -DeepSpeed C++/CUDA extension op reportDeepSpeed C++/CUDA extension op reportDeepSpeed C++/CUDA extension op reportDeepSpeed C++/CUDA extension op report - - - -quantizer .............. [NO] ....... 
[OKAY] ----------------------------------------------------------------------------------------------------- - -----------------------------------------------------------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. ---------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - - --------------------------------------------------- -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -----------------------------------------------------------------------------------------------------JIT compiled ops requires ninja-------------------------------------------------- - - - -JIT compiled ops requires ninjaJIT compiled ops requires ninjaJIT compiled ops requires ninja - - -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -DeepSpeed general environment info: -torch cuda version ............... 11.1 -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -torch version .................... 1.8.1 -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -ninjaninjaninja ninja .................................... .................. .................. [OKAY] [OKAY] - [OKAY][OKAY] --------------------------------------------------- - --------------------------------------------------- -op name---------------------------------------------------------------------------------------------------- - - -................op name op nameop name installed .................................................. 
installedcompatible installed -installed..-------------------------------------------------- - ....compatible compatible -compatible - -----------------------------------------------------------------------------------------------------cpu_adam --------------------------------------------------- - -............... [YES] ...... [OKAY] -cpu_adamcpu_adam cpu_adam .............................. ...............[YES][YES]fused_adam [YES] ......................... ...... [OKAY] [NO] -[OKAY][OKAY] - -....... [OKAY] -fused_lambfused_adam .......................... [NO]fused_adam[NO] fused_adam....... .................................[OKAY] [OKAY] -[NO] -[NO] .......fused_lamb....... [OKAY] ............. -[OKAY] -[NO]sparse_attnfused_lamb ...................fused_lamb [OKAY] .......................... -[NO] [NO][NO]....... .......[OKAY]....... - [OKAY][OKAY]sparse_attn - - transformer............ ............[NO] [NO]....... .......[OKAY] -[OKAY] -transformersparse_attnstochastic_transformersparse_attn ............ ............. ............ [NO][NO] [NO][NO] ....... ..................... [OKAY][OKAY][OKAY][OKAY] - - - -transformertransformerstochastic_transformer ......................... [NO][NO][NO] ..................... [OKAY][OKAY] -[OKAY] - -stochastic_transformer . stochastic_transformer[NO] ........ [OKAY][NO] - ....... [OKAY] -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -ninjaninjaninjaninja .................. ......................................................[OKAY] -[OKAY][OKAY][OKAY]-------------------------------------------------- - - - ------------------------------------------------------------------------------------------------------------------------------------------------------- -op name - -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** - op nameop name ................op name ................ ................installed................ installed..installedinstalled .. compatible.. .. - compatible-------------------------------------------------- -compatiblecompatible - - ------------------------------------------------------------------------------------------------------------------------------------------------------- - - -cpu_adam ............... [YES] ......cpu_adam cpu_adam[OKAY]cpu_adam............... - ..............................[YES] [YES] ......[YES]...... [OKAY]fused_adam...... -[OKAY] -.............[OKAY] -[NO] ....... [OKAY] -fused_adam ............. fused_adamfused_lamb[NO] fused_adam.......................... ....... .............[NO][OKAY] -[NO] ....... [NO] fused_lamb.......[OKAY] -............. ....... [OKAY] [NO] - [OKAY]....... - fused_lamb[OKAY] -/bin/sh: line 0: type: git: not found -fused_lamb............. sparse_attn.............[NO] ............[NO]....... [NO] .......[OKAY].......sparse_attn - [OKAY][OKAY]............ - - [NO] transformer....... ............[OKAY] -[NO] sparse_attn....... [OKAY]transformer............ - sparse_attn ............ [NO] ............stochastic_transformer [NO] [NO].............. . .......[OKAY][OKAY][NO] - - [OKAY]....... -stochastic_transformer transformer [OKAY]transformer -. ............ ............ [NO] [NO] [NO]....... 
..............[OKAY] -[OKAY][OKAY] - -stochastic_transformerstochastic_transformer .. [NO][NO] .............. [OKAY][OKAY] - -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install path ............... torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -.................... 1.8.1torch version - ....................torch cuda version 1.8.1............... - 11.1torch cuda version - nvcc version............... .....................11.1 -11.2nvcc version - deepspeed install path..................... ...........11.2 -['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']deepspeed install path -DeepSpeed general environment info: - ...........deepspeed info ................... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']0.4.2+bc17042, bc17042, big-science - -deepspeed infodeepspeed wheel compiled w. ......................... 0.4.2+bc17042, bc17042, big-sciencetorch 1.8, cuda 11.1 - -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1DeepSpeed general environment info: -torch cuda version -............... 11.1 -nvcc versiontorch install path ..................... 11.2............... -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found - deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']deepspeed info - ................... torch version0.4.2+bc17042, bc17042, big-science -....................deepspeed wheel compiled w. 1.8.1...... - torch 1.8, cuda 11.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... 
 [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`.
async_io ............... [NO] ....... [NO]
transformer_inference .. [NO] ....... [OKAY]
utils .................. [YES] ...... [OKAY]
quantizer .............. [NO] ....... [OKAY]
--------------------------------------------------
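The async_io warning repeats on every rank because the op's JIT build needs the libaio development headers, which the compute nodes lack. It is benign for this run: async_io backs ZeRO-Infinity's NVMe offload, and the job runs plain ZeRO stage 1 (see the zero_stage argument below). A hedged sketch of the same probe, assuming AsyncIOBuilder follows the common op-builder interface:

    # Sketch only: probe the async_io op the way the report above does.
    from deepspeed.ops.op_builder import AsyncIOBuilder

    if not AsyncIOBuilder().is_compatible():
        # Matches the [WARNING] in the log; on Debian-family systems the
        # fix suggested there is: apt install libaio-dev
        print("async_io not available: libaio-dev is missing")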
using world size: 256, data-parallel-size: 8, tensor-model-parallel size: 4, pipeline-model-parallel size: 8
using torch.float16 for parameters ...
------------------------ arguments ------------------------
  accumulate_allreduce_grads_in_fp32 .............. False
  adam_beta1 ...................................... 0.9
  adam_beta2 ...................................... 0.999
  adam_eps ........................................ 1e-08
  adlr_autoresume ................................. False
  adlr_autoresume_interval ........................ 1000
  apply_query_key_layer_scaling ................... True
  apply_residual_connection_post_layernorm ........ False
  attention_dropout ............................... 0.1
  attention_softmax_in_fp32 ....................... False
  bert_binary_head ................................ True
  bert_load ....................................... None
  bf16 ............................................ False
  bias_dropout_fusion ............................. True
  bias_gelu_fusion ................................ True
  biencoder_projection_dim ........................ 0
  biencoder_shared_query_context_model ............ False
  block_data_path ................................. None
  checkpoint_activations .......................... True
  checkpoint_in_cpu ............................... False
  checkpoint_num_layers ........................... 1
  clip_grad ....................................... 1.0
  codecarbon_dir .................................. /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/tr8-104B-logs/codecarbon
  consumed_train_samples .......................... 0
  consumed_valid_samples .......................... 0
  contigious_checkpointing ........................ False
  cpu_optimizer ................................... False
  cpu_torch_adam .................................. False
  data_impl ....................................... mmap
  data_parallel_size .............................. 8
  data_path ....................................... ['/gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document']
  dataloader_type ................................. single
  DDP_impl ........................................ local
  decoder_seq_length .............................. None
  deepscale ....................................... False
  deepscale_config ................................ None
  deepspeed ....................................... True
  deepspeed_activation_checkpointing .............. True
  deepspeed_config ................................ ./ds_config.1271196.json
  deepspeed_mpi ................................... False
  distribute_checkpointed_activations ............. False
  distributed_backend ............................. nccl
  embedding_path .................................. None
  encoder_seq_length .............................. 2048
  eod_mask_loss ................................... False
  eval_interval ................................... 1000
  eval_iters ...................................... 5
  evidence_data_path .............................. None
  exit_duration_in_mins ........................... 1190
  exit_interval ................................... None
  ffn_hidden_size ................................. 20480
  finetune ........................................ False
  fp16 ............................................ True
  fp16_lm_cross_entropy ........................... False
  fp32_residual_connection ........................ False
  global_batch_size ............................... 2048
  hidden_dropout .................................. 0.1
  hidden_size ..................................... 16384
  hysteresis ...................................... 2
  ict_head_size ................................... None
  ict_load ........................................ None
  img_dim ......................................... 224
  indexer_batch_size .............................. 128
  indexer_log_interval ............................ 1000
  init_method_std ................................. 0.02
  init_method_xavier_uniform ...................... False
  initial_loss_scale .............................. 4294967296
  kv_channels ..................................... 512
  layernorm_epsilon ............................... 1e-05
  lazy_mpu_init ................................... None
  load ............................................ /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints
  local_rank ...................................... 0
  log_batch_size_to_tensorboard ................... True
  log_interval .................................... 10
  log_learning_rate_to_tensorboard ................ True
  log_loss_scale_to_tensorboard ................... True
  log_num_zeros_in_grad ........................... False
  log_params_norm ................................. False
  log_timers_to_tensorboard ....................... True
  log_validation_ppl_to_tensorboard ............... True
  loss_scale ...................................... 12.0
  loss_scale_window ............................... 1000
  lr .............................................. 6e-05
  lr_decay_iters .................................. None
  lr_decay_samples ................................ 126953125
  lr_decay_style .................................. cosine
  lr_warmup_fraction .............................. None
  lr_warmup_iters ................................. 0
  lr_warmup_samples ............................... 216320
  make_vocab_size_divisible_by .................... 128
  mask_prob ....................................... 0.15
  masked_softmax_fusion ........................... True
  max_position_embeddings ......................... 2048
  memory_centric_tiled_linear ..................... False
  merge_file ...................................... /gpfswork/rech/six/commun/code/tr8-104B/Megatron-DeepSpeed-tr8-104B/data/gpt2-merges.txt
  micro_batch_size ................................ 1
  min_loss_scale .................................. 1.0
  min_lr .......................................... 6e-06
  mmap_warmup ..................................... False
  no_load_optim ................................... None
  no_load_rng ..................................... None
  no_save_optim ................................... None
  no_save_rng ..................................... None
  num_attention_heads ............................. 32
  num_channels .................................... 3
  num_classes ..................................... 1000
  num_layers ...................................... 32
  num_layers_per_virtual_pipeline_stage ........... None
  num_workers ..................................... 2
  onnx_safe ....................................... None
  openai_gelu ..................................... False
  optimizer ....................................... adam
  override_lr_scheduler ........................... False
  params_dtype .................................... torch.float16
  partition_activations ........................... False
  patch_dim ....................................... 16
  pipeline_model_parallel_size .................... 8
  position_embedding_type ......................... PositionEmbeddingType.absolute
  profile_backward ................................ False
  query_in_block_prob ............................. 0.1
  rampup_batch_size ............................... ['16', '16', '6_000_000']
  rank ............................................ 0
  remote_device ................................... none
  reset_attention_mask ............................ False
  reset_position_ids .............................. False
  retriever_report_topk_accuracies ................ []
  retriever_score_scaling ......................... False
  retriever_seq_length ............................ 256
  sample_rate ..................................... 1.0
  save ............................................ /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints
  save_interval ................................... 1500
  scatter_gather_tensors_in_pipeline .............. True
  scattered_embeddings ............................ False
  seed ............................................ 43
  seq_length ...................................... 2048
  sgd_momentum .................................... 0.9
  short_seq_prob .................................. 0.1
  split ........................................... 949,50,1
  split_transformers .............................. False
  synchronize_each_layer .......................... False
  tensor_model_parallel_size ...................... 4
  tensorboard_dir ................................. /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/tr8-104B-logs/tensorboard
  tensorboard_log_interval ........................ 1
  tensorboard_queue_size .......................... 5
  tile_factor ..................................... 1
  titles_data_path ................................ None
  tokenizer_name_or_path .......................... None
  tokenizer_type .................................. GPT2BPETokenizer
  train_iters ..................................... None
  train_samples ................................... 300000000
  use_checkpoint_lr_scheduler ..................... False
  use_contiguous_buffers_in_ddp ................... False
  use_cpu_initialization .......................... None
  use_one_sent_docs ............................... False
  use_pin_memory .................................. False
  virtual_pipeline_model_parallel_size ............ None
  vocab_extra_ids ................................. 0
  vocab_file ...................................... /gpfswork/rech/six/commun/code/tr8-104B/Megatron-DeepSpeed-tr8-104B/data/gpt2-vocab.json
  weight_decay .................................... 0.1
  world_size ...................................... 256
  zero_allgather_bucket_size ...................... 0.0
  zero_contigious_gradients ....................... False
  zero_reduce_bucket_size ......................... 0.0
  zero_reduce_scatter ............................. False
  zero_stage ...................................... 1
-------------------- end of arguments ---------------------
will use batch size rampup starting from global batch size 16 to global batch size 2048 with batch size increments 16 over 6000000 samples.
> building GPT2BPETokenizer tokenizer ...
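The two "using ..." lines and the rampup message pin down the job's layout: 256 ranks factor into 8-way data, 4-way tensor, and 8-way pipeline parallelism, and the global batch size climbs from 16 to 2048 in steps of 16 over the first 6,000,000 samples (the rampup_batch_size = ['16', '16', '6_000_000'] argument above). The sketch below mirrors, but is not, Megatron-DeepSpeed's logic; the hold-each-increment-for-an-equal-share-of-samples rule and all names are illustrative assumptions:

    # Sketch of the arithmetic behind the "using world size" and batch-size
    # rampup lines above; illustrative, not the Megatron implementation.
    dp, tp, pp = 8, 4, 8
    assert dp * tp * pp == 256                       # world size

    start, step, ramp_samples, target = 16, 16, 6_000_000, 2048
    n_increments = (target - start) // step          # 127 increments of +16
    hold = ramp_samples // n_increments              # ~47,244 samples each

    def global_batch_size(consumed_samples: int) -> int:
        """Assumed schedule: +16 after every `hold` samples, capped at 2048."""
        return min(start + (consumed_samples // hold) * step, target)

    # With micro_batch_size=1 and dp=8, each step accumulates gradients over
    # global_batch_size / (micro_batch_size * dp) micro-batches per replica:
    print(global_batch_size(0))           # 16   -> 2 accumulation steps
    print(global_batch_size(6_000_000))   # 2048 -> 256 accumulation steps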
--------------------------------------------------
DeepSpeed C++/CUDA extension op report
--------------------------------------------------
NOTE: Ops not installed will be just-in-time (JIT) compiled at
      runtime if needed.
Op compatibility means that your system - meet the required dependencies to JIT install the op.-------------------------------------------------- - ---------------------------------------------------JIT compiled ops requires ninja - -JIT compiled ops requires ninja -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']DeepSpeed general environment info: - -torch version .................... 1.8.1 -torch install path torch cuda version............... ............... 11.1 -nvcc version ..................... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']11.2 - -deepspeed install path torch version........... ....................['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -1.8.1 -deepspeed info ...................torch cuda version 0.4.2+bc17042, bc17042, big-science -............... deepspeed wheel compiled w.11.1 -...... nvcc versiontorch 1.8, cuda 11.1 -..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... 
[NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -ninjaninjaninja ninja ...................................................... .................. [OKAY][OKAY] - -[OKAY]--------------------------------------------------[OKAY]-------------------------------------------------- - - - -----------------------------------------------------------------------------------------------------op nameop name - -op name ................op name ................ ................................installedinstalled installedinstalled.. .. .. ..compatiblecompatiblecompatible - -compatible-------------------------------------------------- - --------------------------------------------------- --------------------------------------------------- --------------------------------------------------- - -cpu_adam ...............cpu_adam cpu_adam [YES] cpu_adam............... ............... ...... ...............[YES][YES] [OKAY] -[YES] ............ [OKAY]......[OKAY] - -fused_adam[OKAY] ............. - [NO] ....... fused_adam[OKAY] -fused_adam............. .............fused_adam fused_lamb .............[NO][NO] ............. [NO] .............. [NO] ....... [OKAY] [OKAY] ....... - -[OKAY] -fused_lamb[OKAY]fused_lamb -............. fused_lamb ............. [NO] ............. [NO] .......[NO]....... sparse_attn .......[OKAY][OKAY] - -[OKAY]............ - [NO] ....... [OKAY] -transformer ............ [NO]sparse_attnsparse_attn sparse_attn............................... ............[NO] [NO][OKAY] [NO] ....... -.......stochastic_transformer ....... [OKAY][OKAY]. -[OKAY] - -transformer[NO] transformer ................... transformer[NO]............ [OKAY]............ -.......[NO] [NO][OKAY]....... - .......[OKAY] -stochastic_transformer[OKAY] -.stochastic_transformer stochastic_transformer[NO] ........ . [NO] [OKAY] -.......[NO] [OKAY]....... - [OKAY] -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - -DeepSpeed general environment info: -DeepSpeed general environment info:torch install path - ............... torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -.................... 1.8.1torch version - ....................torch cuda version 1.8.1............... - 11.1 -torch cuda versionnvcc version ............... 11.1 -nvcc version ..................... 11.2 -.....................deepspeed install path 11.2........... - ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ...................deepspeed install path 0.4.2+bc17042, bc17042, big-science........... - deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... 
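The block above is DeepSpeed's C++/CUDA extension op report (the same table the `ds_report` CLI prints): ops prebuilt into the wheel show [YES], while ops marked [NO] with an [OKAY] compatibility column will be JIT-compiled with ninja on first use. A minimal sketch of querying an op's status programmatically, assuming the DeepSpeed 0.4.x layout where per-op builders live under deepspeed.ops.op_builder:

# Minimal sketch, assuming DeepSpeed 0.4.x: each extension op exposes a
# builder whose is_compatible() reports whether this system meets the
# dependencies to JIT-install the op (the [OKAY] column above).
from deepspeed.ops.op_builder import CPUAdamBuilder, FusedAdamBuilder

for builder in (CPUAdamBuilder(), FusedAdamBuilder()):
    print(f"{builder.name:<12} compatible: {builder.is_compatible()}")

Running `ds_report` on a node regenerates the full table, including the libaio-dev hint for async_io seen above.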
- > padded vocab (size: 50257) with 431 dummy tokens (new size: 50688)
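The "padded vocab" line records Megatron's embedding-padding arithmetic: the 50257-token GPT-2 vocabulary is extended with dummy tokens so the embedding matrix divides evenly across tensor-parallel ranks (the size is rounded up to a multiple of make-vocab-size-divisible-by times the tensor-parallel degree). A minimal sketch of that rounding follows; the multiple of 512 is inferred from the logged numbers, not read from the run's arguments:

# Minimal sketch of the vocab-padding arithmetic behind the log line above.
# The multiple of 512 is an inference from 50257 -> 50688; the log does not
# state the underlying divisible-by / tensor-parallel factors.
def pad_vocab_size(orig_vocab_size: int, multiple: int = 512) -> int:
    # Round up to the next multiple.
    return ((orig_vocab_size + multiple - 1) // multiple) * multiple

padded = pad_vocab_size(50257)
print(padded, padded - 50257)  # 50688 431, matching the log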
- [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`.
-async_io ............... [NO] ....... [NO]
-transformer_inference .. 
[NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [NO] ....... - [OKAY] -utils .................. [YES] ...... [OKAY] -quantizerasync_io ............................. [NO][NO] .............. [OKAY][NO] - --------------------------------------------------- -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install path torch install path............... ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version torch version.................... ....................1.8.1 -1.8.1 -torch cuda version torch cuda version............... ...............11.1 -11.1nvcc version - nvcc version..................... .....................11.2 -11.2 -deepspeed install pathdeepspeed install path ...................... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed infodeepspeed info ...................................... 0.4.2+bc17042, bc17042, big-science0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. -...... deepspeed wheel compiled w.torch 1.8, cuda 11.1 ...... - torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install path ...............torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version torch version.................... ....................1.8.1 -1.8.1 -torch cuda version torch cuda version............... ...............11.1 -11.1 -nvcc version nvcc version..................... .....................11.2 -11.2 -deepspeed install path deepspeed install path........... ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info deepspeed info................... ................... 0.4.2+bc17042, bc17042, big-science0.4.2+bc17042, bc17042, big-science - -deepspeed wheel compiled w.deepspeed wheel compiled w. ............ 
torch 1.8, cuda 11.1torch 1.8, cuda 11.1 - - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - - -async_ioasync_io ............... ...............async_io[NO] [NO] ............................. [NO][NO][NO] - -....... [NO] -transformer_inferencetransformer_inference ..transformer_inference .. [NO] .. [NO] ....... [NO] ....... [OKAY] ....... -[OKAY] -[OKAY] -utilsutils utils .................. .................. .................. [YES] [YES] [YES] ...... ...... ...... [OKAY] [OKAY] -[OKAY] - -quantizerquantizer quantizer............................ ..............[NO][NO] [NO].............. [OKAY][OKAY]....... - - [OKAY] ----------------------------------------------------------------------------------------------------- - --------------------------------------------------- -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 
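-Every rank prints the same environment and op-compatibility report at startup, and Megatron additionally probes git (absent on these compute nodes, hence git_hash=unknown). The same report can be regenerated offline with DeepSpeed's `ds_report` command-line utility; as a minimal sketch, the version portion can be reproduced with nothing but public torch/deepspeed attributes:
-
-    import torch
-    import deepspeed
-
-    # Matches the values logged above: torch 1.8.1 built for CUDA 11.1,
-    # DeepSpeed 0.4.2 from the big-science branch.
-    print("torch version ......", torch.__version__)
-    print("torch cuda version .", torch.version.cuda)
-    print("deepspeed info .....", deepspeed.__version__)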
-> setting codecarbon ...
-> initializing torch distributed ...
-> setting tensorboard ...
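-Behind the "initializing torch distributed" line, each rank joins a NCCL process group before any model construction happens. A minimal sketch, assuming the usual env:// rendezvous where the launcher sets RANK, WORLD_SIZE and MASTER_ADDR for every process:
-
-    import torch.distributed as dist
-
-    # Each process calls this once; the env:// rendezvous reads the rank and
-    # world size from environment variables set by the launcher.
-    dist.init_process_group(backend="nccl", init_method="env://")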
-> initializing tensor model parallel with size 4
-> initializing pipeline model parallel with size 8
-> setting random seeds to 43 ...
-[2021-09-27 17:43:40,715] [INFO] [checkpointing.py:226:model_parallel_cuda_manual_seed] > initializing model parallel cuda seeds on global rank 0, model parallel rank 0, and data parallel rank 0 with model parallel seed: 2761 and data parallel seed: 43
-> compiling dataset index builder ...
-make: Entering directory '/gpfsssd/worksf/projects/rech/six/commun/code/tr8-104B/Megatron-DeepSpeed-tr8-104B/megatron/data'
-make: Nothing to be done for 'default'.
-make: Leaving directory '/gpfsssd/worksf/projects/rech/six/commun/code/tr8-104B/Megatron-DeepSpeed-tr8-104B/megatron/data'
->>> done with dataset index builder. Compilation time: 0.304 seconds
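-The model parallel seed logged above is derived from the base seed rather than chosen independently. A minimal sketch of the scheme; the 2718 offset is an assumption recalled from the activation-checkpointing code, but it is consistent with the logged values (43 + 2718 + rank 0 = 2761):
-
-    def derive_seeds(base_seed: int, model_parallel_rank: int):
-        # Each tensor-parallel rank gets its own cuda seed, offset from the
-        # base seed; the data-parallel seed stays the base seed itself.
-        model_parallel_seed = base_seed + 2718 + model_parallel_rank
-        data_parallel_seed = base_seed
-        return model_parallel_seed, data_parallel_seed
-
-    assert derive_seeds(43, 0) == (2761, 43)  # matches the log line above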
-> compiling and loading fused kernels ...
-/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch/utils/cpp_extension.py:283: UserWarning:
-
-                               !! WARNING !!
-
-!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
-Your compiler (c++) is not compatible with the compiler Pytorch was
-built with for this platform, which is g++ on linux. Please
-use g++ to compile your extension. Alternatively, you may
-compile PyTorch from source using c++, and then you can also use
-c++ to compile your extension.
-
-See https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md for help
-with compiling PyTorch from source.
-!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
-
-                               !! WARNING !!
-
-  warnings.warn(WRONG_COMPILER_WARNING.format(
-Detected CUDA files, patching ldflags
-Emitting ninja build file /gpfsssd/worksf/projects/rech/six/commun/code/tr8-104B/Megatron-DeepSpeed-tr8-104B/megatron/fused_kernels/build/build.ninja...
-Building extension module scaled_upper_triang_masked_softmax_cuda...
-Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
-ninja: no work to do.
-Loading extension module scaled_upper_triang_masked_softmax_cuda...
-Detected CUDA files, patching ldflags
-Emitting ninja build file /gpfsssd/worksf/projects/rech/six/commun/code/tr8-104B/Megatron-DeepSpeed-tr8-104B/megatron/fused_kernels/build/build.ninja...
-Building extension module scaled_masked_softmax_cuda...
-Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
-ninja: no work to do.
-Loading extension module scaled_masked_softmax_cuda...
-Detected CUDA files, patching ldflags
-Emitting ninja build file /gpfsssd/worksf/projects/rech/six/commun/code/tr8-104B/Megatron-DeepSpeed-tr8-104B/megatron/fused_kernels/build/build.ninja...
-Building extension module fused_mix_prec_layer_norm_cuda...
-Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
-ninja: no work to do.
-Loading extension module fused_mix_prec_layer_norm_cuda...
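-Each Building/Loading pair above is PyTorch's JIT extension loader at work; the build cache is already warm here, so ninja finds no work to do and the previously built modules are simply loaded. A minimal sketch of the mechanism (source file names and flags are illustrative, not the exact ones used by the Megatron fused_kernels package):
-
-    from torch.utils.cpp_extension import load
-
-    # JIT-compiles the CUDA kernel on first use, reusing the ninja build
-    # cache on subsequent launches, and imports it as a Python module.
-    scaled_masked_softmax_cuda = load(
-        name="scaled_masked_softmax_cuda",
-        sources=["scaled_masked_softmax.cpp", "scaled_masked_softmax_cuda.cu"],
-        extra_cuda_cflags=["-O3"],
-        verbose=True,
-    )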
-!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! - - !! WARNING !! - - warnings.warn(WRONG_COMPILER_WARNING.format( -/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch/utils/cpp_extension.py:283: UserWarning: - - !! WARNING !! - -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! -Your compiler (c++) is not compatible with the compiler Pytorch was -built with for this platform, which is g++ on linux. Please -use g++ to to compile your extension. Alternatively, you may -compile PyTorch from source using c++, and then you can also use -c++ to compile your extension. - -See https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md for help -with compiling PyTorch from source. -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! - - !! WARNING !! - - warnings.warn(WRONG_COMPILER_WARNING.format( -/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch/utils/cpp_extension.py:283: UserWarning: - - !! WARNING !! - -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! -Your compiler (c++) is not compatible with the compiler Pytorch was -built with for this platform, which is g++ on linux. Please -use g++ to to compile your extension. Alternatively, you may -compile PyTorch from source using c++, and then you can also use -c++ to compile your extension. - -See https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md for help -with compiling PyTorch from source. -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! - - !! WARNING !! - - warnings.warn(WRONG_COMPILER_WARNING.format( -/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch/utils/cpp_extension.py:283: UserWarning: - - !! WARNING !! - -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! -Your compiler (c++) is not compatible with the compiler Pytorch was -built with for this platform, which is g++ on linux. Please -use g++ to to compile your extension. Alternatively, you may -compile PyTorch from source using c++, and then you can also use -c++ to compile your extension. - -See https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md for help -with compiling PyTorch from source. -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! - - !! WARNING !! - - warnings.warn(WRONG_COMPILER_WARNING.format( -/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch/utils/cpp_extension.py:283: UserWarning: - - !! WARNING !! - -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! -Your compiler (c++) is not compatible with the compiler Pytorch was -built with for this platform, which is g++ on linux. Please -use g++ to to compile your extension. Alternatively, you may -compile PyTorch from source using c++, and then you can also use -c++ to compile your extension. - -See https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md for help -with compiling PyTorch from source. -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! - - !! WARNING !! - - warnings.warn(WRONG_COMPILER_WARNING.format( -/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch/utils/cpp_extension.py:283: UserWarning: - - !! WARNING !! - -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! 
-Your compiler (c++) is not compatible with the compiler Pytorch was -built with for this platform, which is g++ on linux. Please -use g++ to to compile your extension. Alternatively, you may -compile PyTorch from source using c++, and then you can also use -c++ to compile your extension. - -See https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md for help -with compiling PyTorch from source. -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! - - !! WARNING !! - - warnings.warn(WRONG_COMPILER_WARNING.format( -/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch/utils/cpp_extension.py:283: UserWarning: - - !! WARNING !! - -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! -Your compiler (c++) is not compatible with the compiler Pytorch was -built with for this platform, which is g++ on linux. Please -use g++ to to compile your extension. Alternatively, you may -compile PyTorch from source using c++, and then you can also use -c++ to compile your extension. - -See https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md for help -with compiling PyTorch from source. -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! - - !! WARNING !! - - warnings.warn(WRONG_COMPILER_WARNING.format( -/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch/utils/cpp_extension.py:283: UserWarning: - - !! WARNING !! - -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! -Your compiler (c++) is not compatible with the compiler Pytorch was -built with for this platform, which is g++ on linux. Please -use g++ to to compile your extension. Alternatively, you may -compile PyTorch from source using c++, and then you can also use -c++ to compile your extension. - -See https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md for help -with compiling PyTorch from source. -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! - - !! WARNING !! - - warnings.warn(WRONG_COMPILER_WARNING.format( -/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch/utils/cpp_extension.py:283: UserWarning: - - !! WARNING !! - -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! -Your compiler (c++) is not compatible with the compiler Pytorch was -built with for this platform, which is g++ on linux. Please -use g++ to to compile your extension. Alternatively, you may -compile PyTorch from source using c++, and then you can also use -c++ to compile your extension. - -See https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md for help -with compiling PyTorch from source. -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! - - !! WARNING !! - - warnings.warn(WRONG_COMPILER_WARNING.format( -/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch/utils/cpp_extension.py:283: UserWarning: - - !! WARNING !! - -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! -Your compiler (c++) is not compatible with the compiler Pytorch was -built with for this platform, which is g++ on linux. Please -use g++ to to compile your extension. Alternatively, you may -compile PyTorch from source using c++, and then you can also use -c++ to compile your extension. - -See https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md for help -with compiling PyTorch from source. 
-!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! - - !! WARNING !! - - warnings.warn(WRONG_COMPILER_WARNING.format( -/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch/utils/cpp_extension.py:283: UserWarning: - - !! WARNING !! - -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! -Your compiler (c++) is not compatible with the compiler Pytorch was -built with for this platform, which is g++ on linux. Please -use g++ to to compile your extension. Alternatively, you may -compile PyTorch from source using c++, and then you can also use -c++ to compile your extension. - -See https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md for help -with compiling PyTorch from source. -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! - - !! WARNING !! - - warnings.warn(WRONG_COMPILER_WARNING.format( -/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch/utils/cpp_extension.py:283: UserWarning: - - !! WARNING !! - -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! -Your compiler (c++) is not compatible with the compiler Pytorch was -built with for this platform, which is g++ on linux. Please -use g++ to to compile your extension. Alternatively, you may -compile PyTorch from source using c++, and then you can also use -c++ to compile your extension. - -See https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md for help -with compiling PyTorch from source. -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! - - !! WARNING !! - - warnings.warn(WRONG_COMPILER_WARNING.format( -/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch/utils/cpp_extension.py:283: UserWarning: - - !! WARNING !! - -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! -Your compiler (c++) is not compatible with the compiler Pytorch was -built with for this platform, which is g++ on linux. Please -use g++ to to compile your extension. Alternatively, you may -compile PyTorch from source using c++, and then you can also use -c++ to compile your extension. - -See https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md for help -with compiling PyTorch from source. -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! - - !! WARNING !! - - warnings.warn(WRONG_COMPILER_WARNING.format( -/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch/utils/cpp_extension.py:283: UserWarning: - - !! WARNING !! - -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! -Your compiler (c++) is not compatible with the compiler Pytorch was -built with for this platform, which is g++ on linux. Please -use g++ to to compile your extension. Alternatively, you may -compile PyTorch from source using c++, and then you can also use -c++ to compile your extension. - -See https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md for help -with compiling PyTorch from source. -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! - - !! WARNING !! - - warnings.warn(WRONG_COMPILER_WARNING.format( -/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch/utils/cpp_extension.py:283: UserWarning: - - !! WARNING !! - -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! 
-Your compiler (c++) is not compatible with the compiler Pytorch was -built with for this platform, which is g++ on linux. Please -use g++ to to compile your extension. Alternatively, you may -compile PyTorch from source using c++, and then you can also use -c++ to compile your extension. - -See https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md for help -with compiling PyTorch from source. -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! - - !! WARNING !! - - warnings.warn(WRONG_COMPILER_WARNING.format( -/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch/utils/cpp_extension.py:283: UserWarning: - - !! WARNING !! - -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! -Your compiler (c++) is not compatible with the compiler Pytorch was -built with for this platform, which is g++ on linux. Please -use g++ to to compile your extension. Alternatively, you may -compile PyTorch from source using c++, and then you can also use -c++ to compile your extension. - -See https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md for help -with compiling PyTorch from source. -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! - - !! WARNING !! - - warnings.warn(WRONG_COMPILER_WARNING.format( -/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch/utils/cpp_extension.py:283: UserWarning: - - !! WARNING !! - -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! -Your compiler (c++) is not compatible with the compiler Pytorch was -built with for this platform, which is g++ on linux. Please -use g++ to to compile your extension. Alternatively, you may -compile PyTorch from source using c++, and then you can also use -c++ to compile your extension. - -See https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md for help -with compiling PyTorch from source. -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! - - !! WARNING !! - - warnings.warn(WRONG_COMPILER_WARNING.format( -/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch/utils/cpp_extension.py:283: UserWarning: - - !! WARNING !! - -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! -Your compiler (c++) is not compatible with the compiler Pytorch was -built with for this platform, which is g++ on linux. Please -use g++ to to compile your extension. Alternatively, you may -compile PyTorch from source using c++, and then you can also use -c++ to compile your extension. - -See https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md for help -with compiling PyTorch from source. -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! - - !! WARNING !! - - warnings.warn(WRONG_COMPILER_WARNING.format( -/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch/utils/cpp_extension.py:283: UserWarning: - - !! WARNING !! - -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! -Your compiler (c++) is not compatible with the compiler Pytorch was -built with for this platform, which is g++ on linux. Please -use g++ to to compile your extension. Alternatively, you may -compile PyTorch from source using c++, and then you can also use -c++ to compile your extension. - -See https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md for help -with compiling PyTorch from source. 
-!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! - - !! WARNING !! - - warnings.warn(WRONG_COMPILER_WARNING.format( -/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch/utils/cpp_extension.py:283: UserWarning: - - !! WARNING !! - -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! -Your compiler (c++) is not compatible with the compiler Pytorch was -built with for this platform, which is g++ on linux. Please -use g++ to to compile your extension. Alternatively, you may -compile PyTorch from source using c++, and then you can also use -c++ to compile your extension. - -See https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md for help -with compiling PyTorch from source. -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! - - !! WARNING !! - - warnings.warn(WRONG_COMPILER_WARNING.format( -/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch/utils/cpp_extension.py:283: UserWarning: - - !! WARNING !! - -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! -Your compiler (c++) is not compatible with the compiler Pytorch was -built with for this platform, which is g++ on linux. Please -use g++ to to compile your extension. Alternatively, you may -compile PyTorch from source using c++, and then you can also use -c++ to compile your extension. - -See https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md for help -with compiling PyTorch from source. -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! - - !! WARNING !! - - warnings.warn(WRONG_COMPILER_WARNING.format( -/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch/utils/cpp_extension.py:283: UserWarning: - - !! WARNING !! - -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! -Your compiler (c++) is not compatible with the compiler Pytorch was -built with for this platform, which is g++ on linux. Please -use g++ to to compile your extension. Alternatively, you may -compile PyTorch from source using c++, and then you can also use -c++ to compile your extension. - -See https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md for help -with compiling PyTorch from source. -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! - - !! WARNING !! - - warnings.warn(WRONG_COMPILER_WARNING.format( -/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch/utils/cpp_extension.py:283: UserWarning: - - !! WARNING !! - -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! -Your compiler (c++) is not compatible with the compiler Pytorch was -built with for this platform, which is g++ on linux. Please -use g++ to to compile your extension. Alternatively, you may -compile PyTorch from source using c++, and then you can also use -c++ to compile your extension. - -See https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md for help -with compiling PyTorch from source. -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! - - !! WARNING !! - - warnings.warn(WRONG_COMPILER_WARNING.format( -/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch/utils/cpp_extension.py:283: UserWarning: - - !! WARNING !! - -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! 
-Your compiler (c++) is not compatible with the compiler Pytorch was -built with for this platform, which is g++ on linux. Please -use g++ to to compile your extension. Alternatively, you may -compile PyTorch from source using c++, and then you can also use -c++ to compile your extension. - -See https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md for help -with compiling PyTorch from source. -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! - - !! WARNING !! - - warnings.warn(WRONG_COMPILER_WARNING.format( -/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch/utils/cpp_extension.py:283: UserWarning: - - !! WARNING !! - -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! -Your compiler (c++) is not compatible with the compiler Pytorch was -built with for this platform, which is g++ on linux. Please -use g++ to to compile your extension. Alternatively, you may -compile PyTorch from source using c++, and then you can also use -c++ to compile your extension. - -See https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md for help -with compiling PyTorch from source. -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! - - !! WARNING !! - - warnings.warn(WRONG_COMPILER_WARNING.format( -/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch/utils/cpp_extension.py:283: UserWarning: - - !! WARNING !! - -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! -Your compiler (c++) is not compatible with the compiler Pytorch was -built with for this platform, which is g++ on linux. Please -use g++ to to compile your extension. Alternatively, you may -compile PyTorch from source using c++, and then you can also use -c++ to compile your extension. - -See https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md for help -with compiling PyTorch from source. -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! - - !! WARNING !! - - warnings.warn(WRONG_COMPILER_WARNING.format( -/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch/utils/cpp_extension.py:283: UserWarning: - - !! WARNING !! - -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! -Your compiler (c++) is not compatible with the compiler Pytorch was -built with for this platform, which is g++ on linux. Please -use g++ to to compile your extension. Alternatively, you may -compile PyTorch from source using c++, and then you can also use -c++ to compile your extension. - -See https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md for help -with compiling PyTorch from source. -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! - - !! WARNING !! - - warnings.warn(WRONG_COMPILER_WARNING.format( -/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch/utils/cpp_extension.py:283: UserWarning: - - !! WARNING !! - -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! -Your compiler (c++) is not compatible with the compiler Pytorch was -built with for this platform, which is g++ on linux. Please -use g++ to to compile your extension. Alternatively, you may -compile PyTorch from source using c++, and then you can also use -c++ to compile your extension. - -See https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md for help -with compiling PyTorch from source. 
->>> done with compiling and loading fused kernels. Compilation time: 22.376 seconds
-time to initialize megatron (seconds): 67.410
-[after megatron is initialized] datetime: 2021-09-27 17:44:03
-building GPT model ...
-[2021-09-27 17:44:03,479] [INFO] [utils.py:680:see_memory_usage] Before Building Model -/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch/cuda/memory.py:373: FutureWarning: torch.cuda.memory_cached has been renamed to torch.cuda.memory_reserved - warnings.warn( -/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch/cuda/memory.py:381: FutureWarning: torch.cuda.max_memory_cached has been renamed to torch.cuda.max_memory_reserved - warnings.warn( -[2021-09-27 17:44:03,481] [INFO] [utils.py:681:see_memory_usage] MA 0.0 GB Max_MA 0.0 GB CA 0.0 GB Max_CA 0 GB -[2021-09-27 17:44:03,481] [INFO] [utils.py:689:see_memory_usage] CPU Virtual Memory: used = 37.48 GB, percent = 20.0% -SEED_LAYERS=False BASE_SEED=1234 SEED_FN=None -Using topology: {ProcessCoord(pipe=0, data=0, model=0): 0, ProcessCoord(pipe=0, data=0, model=1): 1, ProcessCoord(pipe=0, data=0, model=2): 2, ProcessCoord(pipe=0, data=0, model=3): 3, ProcessCoord(pipe=0, data=1, model=0): 4, ProcessCoord(pipe=0, data=1, model=1): 5, ProcessCoord(pipe=0, data=1, model=2): 6, ProcessCoord(pipe=0, data=1, model=3): 7, ProcessCoord(pipe=0, data=2, model=0): 8, ProcessCoord(pipe=0, data=2, model=1): 9, ProcessCoord(pipe=0, data=2, model=2): 10, ProcessCoord(pipe=0, data=2, model=3): 11, ProcessCoord(pipe=0, data=3, model=0): 12, ProcessCoord(pipe=0, data=3, model=1): 13, ProcessCoord(pipe=0, data=3, model=2): 14, ProcessCoord(pipe=0, data=3, model=3): 15, ProcessCoord(pipe=0, data=4, model=0): 16, ProcessCoord(pipe=0, data=4, model=1): 17, ProcessCoord(pipe=0, data=4, model=2): 18, ProcessCoord(pipe=0, data=4, model=3): 19, ProcessCoord(pipe=0, data=5, model=0): 20, ProcessCoord(pipe=0, data=5, model=1): 21, ProcessCoord(pipe=0, data=5, model=2): 22, ProcessCoord(pipe=0, data=5, model=3): 23, ProcessCoord(pipe=0, data=6, model=0): 24, ProcessCoord(pipe=0, data=6, model=1): 25, ProcessCoord(pipe=0, data=6, model=2): 26, ProcessCoord(pipe=0, data=6, model=3): 27, ProcessCoord(pipe=0, data=7, model=0): 28, ProcessCoord(pipe=0, data=7, model=1): 29, ProcessCoord(pipe=0, data=7, model=2): 30, ProcessCoord(pipe=0, data=7, model=3): 31, ProcessCoord(pipe=1, data=0, model=0): 32, ProcessCoord(pipe=1, data=0, model=1): 33, ProcessCoord(pipe=1, data=0, model=2): 34, ProcessCoord(pipe=1, data=0, model=3): 35, ProcessCoord(pipe=1, data=1, model=0): 36, ProcessCoord(pipe=1, data=1, model=1): 37, ProcessCoord(pipe=1, data=1, model=2): 38, ProcessCoord(pipe=1, data=1, model=3): 39, ProcessCoord(pipe=1, data=2, model=0): 40, ProcessCoord(pipe=1, data=2, model=1): 41, ProcessCoord(pipe=1, data=2, model=2): 42, ProcessCoord(pipe=1, data=2, model=3): 43, ProcessCoord(pipe=1, data=3, model=0): 44, ProcessCoord(pipe=1, data=3, model=1): 45, ProcessCoord(pipe=1, data=3, model=2): 46, ProcessCoord(pipe=1, data=3, model=3): 47, ProcessCoord(pipe=1, data=4, model=0): 48, ProcessCoord(pipe=1, data=4, model=1): 49, ProcessCoord(pipe=1, data=4, model=2): 50, ProcessCoord(pipe=1, data=4, model=3): 51, ProcessCoord(pipe=1, data=5, model=0): 52, ProcessCoord(pipe=1, data=5, model=1): 53, ProcessCoord(pipe=1, data=5, model=2): 54, ProcessCoord(pipe=1, data=5, model=3): 55, ProcessCoord(pipe=1, data=6, model=0): 56, ProcessCoord(pipe=1, data=6, model=1): 57, ProcessCoord(pipe=1, data=6, model=2): 58, ProcessCoord(pipe=1, data=6, model=3): 59, ProcessCoord(pipe=1, data=7, model=0): 60, ProcessCoord(pipe=1, data=7, model=1): 61, ProcessCoord(pipe=1, data=7, model=2): 62, ProcessCoord(pipe=1, data=7, model=3): 63, ProcessCoord(pipe=2, 
data=0, model=0): 64, ProcessCoord(pipe=2, data=0, model=1): 65, ProcessCoord(pipe=2, data=0, model=2): 66, ProcessCoord(pipe=2, data=0, model=3): 67, ProcessCoord(pipe=2, data=1, model=0): 68, ProcessCoord(pipe=2, data=1, model=1): 69, ProcessCoord(pipe=2, data=1, model=2): 70, ProcessCoord(pipe=2, data=1, model=3): 71, ProcessCoord(pipe=2, data=2, model=0): 72, ProcessCoord(pipe=2, data=2, model=1): 73, ProcessCoord(pipe=2, data=2, model=2): 74, ProcessCoord(pipe=2, data=2, model=3): 75, ProcessCoord(pipe=2, data=3, model=0): 76, ProcessCoord(pipe=2, data=3, model=1): 77, ProcessCoord(pipe=2, data=3, model=2): 78, ProcessCoord(pipe=2, data=3, model=3): 79, ProcessCoord(pipe=2, data=4, model=0): 80, ProcessCoord(pipe=2, data=4, model=1): 81, ProcessCoord(pipe=2, data=4, model=2): 82, ProcessCoord(pipe=2, data=4, model=3): 83, ProcessCoord(pipe=2, data=5, model=0): 84, ProcessCoord(pipe=2, data=5, model=1): 85, ProcessCoord(pipe=2, data=5, model=2): 86, ProcessCoord(pipe=2, data=5, model=3): 87, ProcessCoord(pipe=2, data=6, model=0): 88, ProcessCoord(pipe=2, data=6, model=1): 89, ProcessCoord(pipe=2, data=6, model=2): 90, ProcessCoord(pipe=2, data=6, model=3): 91, ProcessCoord(pipe=2, data=7, model=0): 92, ProcessCoord(pipe=2, data=7, model=1): 93, ProcessCoord(pipe=2, data=7, model=2): 94, ProcessCoord(pipe=2, data=7, model=3): 95, ProcessCoord(pipe=3, data=0, model=0): 96, ProcessCoord(pipe=3, data=0, model=1): 97, ProcessCoord(pipe=3, data=0, model=2): 98, ProcessCoord(pipe=3, data=0, model=3): 99, ProcessCoord(pipe=3, data=1, model=0): 100, ProcessCoord(pipe=3, data=1, model=1): 101, ProcessCoord(pipe=3, data=1, model=2): 102, ProcessCoord(pipe=3, data=1, model=3): 103, ProcessCoord(pipe=3, data=2, model=0): 104, ProcessCoord(pipe=3, data=2, model=1): 105, ProcessCoord(pipe=3, data=2, model=2): 106, ProcessCoord(pipe=3, data=2, model=3): 107, ProcessCoord(pipe=3, data=3, model=0): 108, ProcessCoord(pipe=3, data=3, model=1): 109, ProcessCoord(pipe=3, data=3, model=2): 110, ProcessCoord(pipe=3, data=3, model=3): 111, ProcessCoord(pipe=3, data=4, model=0): 112, ProcessCoord(pipe=3, data=4, model=1): 113, ProcessCoord(pipe=3, data=4, model=2): 114, ProcessCoord(pipe=3, data=4, model=3): 115, ProcessCoord(pipe=3, data=5, model=0): 116, ProcessCoord(pipe=3, data=5, model=1): 117, ProcessCoord(pipe=3, data=5, model=2): 118, ProcessCoord(pipe=3, data=5, model=3): 119, ProcessCoord(pipe=3, data=6, model=0): 120, ProcessCoord(pipe=3, data=6, model=1): 121, ProcessCoord(pipe=3, data=6, model=2): 122, ProcessCoord(pipe=3, data=6, model=3): 123, ProcessCoord(pipe=3, data=7, model=0): 124, ProcessCoord(pipe=3, data=7, model=1): 125, ProcessCoord(pipe=3, data=7, model=2): 126, ProcessCoord(pipe=3, data=7, model=3): 127, ProcessCoord(pipe=4, data=0, model=0): 128, ProcessCoord(pipe=4, data=0, model=1): 129, ProcessCoord(pipe=4, data=0, model=2): 130, ProcessCoord(pipe=4, data=0, model=3): 131, ProcessCoord(pipe=4, data=1, model=0): 132, ProcessCoord(pipe=4, data=1, model=1): 133, ProcessCoord(pipe=4, data=1, model=2): 134, ProcessCoord(pipe=4, data=1, model=3): 135, ProcessCoord(pipe=4, data=2, model=0): 136, ProcessCoord(pipe=4, data=2, model=1): 137, ProcessCoord(pipe=4, data=2, model=2): 138, ProcessCoord(pipe=4, data=2, model=3): 139, ProcessCoord(pipe=4, data=3, model=0): 140, ProcessCoord(pipe=4, data=3, model=1): 141, ProcessCoord(pipe=4, data=3, model=2): 142, ProcessCoord(pipe=4, data=3, model=3): 143, ProcessCoord(pipe=4, data=4, model=0): 144, ProcessCoord(pipe=4, data=4, model=1): 145, 
ProcessCoord(pipe=4, data=4, model=2): 146, ProcessCoord(pipe=4, data=4, model=3): 147, ProcessCoord(pipe=4, data=5, model=0): 148, ProcessCoord(pipe=4, data=5, model=1): 149, ProcessCoord(pipe=4, data=5, model=2): 150, ProcessCoord(pipe=4, data=5, model=3): 151, ProcessCoord(pipe=4, data=6, model=0): 152, ProcessCoord(pipe=4, data=6, model=1): 153, ProcessCoord(pipe=4, data=6, model=2): 154, ProcessCoord(pipe=4, data=6, model=3): 155, ProcessCoord(pipe=4, data=7, model=0): 156, ProcessCoord(pipe=4, data=7, model=1): 157, ProcessCoord(pipe=4, data=7, model=2): 158, ProcessCoord(pipe=4, data=7, model=3): 159, ProcessCoord(pipe=5, data=0, model=0): 160, ProcessCoord(pipe=5, data=0, model=1): 161, ProcessCoord(pipe=5, data=0, model=2): 162, ProcessCoord(pipe=5, data=0, model=3): 163, ProcessCoord(pipe=5, data=1, model=0): 164, ProcessCoord(pipe=5, data=1, model=1): 165, ProcessCoord(pipe=5, data=1, model=2): 166, ProcessCoord(pipe=5, data=1, model=3): 167, ProcessCoord(pipe=5, data=2, model=0): 168, ProcessCoord(pipe=5, data=2, model=1): 169, ProcessCoord(pipe=5, data=2, model=2): 170, ProcessCoord(pipe=5, data=2, model=3): 171, ProcessCoord(pipe=5, data=3, model=0): 172, ProcessCoord(pipe=5, data=3, model=1): 173, ProcessCoord(pipe=5, data=3, model=2): 174, ProcessCoord(pipe=5, data=3, model=3): 175, ProcessCoord(pipe=5, data=4, model=0): 176, ProcessCoord(pipe=5, data=4, model=1): 177, ProcessCoord(pipe=5, data=4, model=2): 178, ProcessCoord(pipe=5, data=4, model=3): 179, ProcessCoord(pipe=5, data=5, model=0): 180, ProcessCoord(pipe=5, data=5, model=1): 181, ProcessCoord(pipe=5, data=5, model=2): 182, ProcessCoord(pipe=5, data=5, model=3): 183, ProcessCoord(pipe=5, data=6, model=0): 184, ProcessCoord(pipe=5, data=6, model=1): 185, ProcessCoord(pipe=5, data=6, model=2): 186, ProcessCoord(pipe=5, data=6, model=3): 187, ProcessCoord(pipe=5, data=7, model=0): 188, ProcessCoord(pipe=5, data=7, model=1): 189, ProcessCoord(pipe=5, data=7, model=2): 190, ProcessCoord(pipe=5, data=7, model=3): 191, ProcessCoord(pipe=6, data=0, model=0): 192, ProcessCoord(pipe=6, data=0, model=1): 193, ProcessCoord(pipe=6, data=0, model=2): 194, ProcessCoord(pipe=6, data=0, model=3): 195, ProcessCoord(pipe=6, data=1, model=0): 196, ProcessCoord(pipe=6, data=1, model=1): 197, ProcessCoord(pipe=6, data=1, model=2): 198, ProcessCoord(pipe=6, data=1, model=3): 199, ProcessCoord(pipe=6, data=2, model=0): 200, ProcessCoord(pipe=6, data=2, model=1): 201, ProcessCoord(pipe=6, data=2, model=2): 202, ProcessCoord(pipe=6, data=2, model=3): 203, ProcessCoord(pipe=6, data=3, model=0): 204, ProcessCoord(pipe=6, data=3, model=1): 205, ProcessCoord(pipe=6, data=3, model=2): 206, ProcessCoord(pipe=6, data=3, model=3): 207, ProcessCoord(pipe=6, data=4, model=0): 208, ProcessCoord(pipe=6, data=4, model=1): 209, ProcessCoord(pipe=6, data=4, model=2): 210, ProcessCoord(pipe=6, data=4, model=3): 211, ProcessCoord(pipe=6, data=5, model=0): 212, ProcessCoord(pipe=6, data=5, model=1): 213, ProcessCoord(pipe=6, data=5, model=2): 214, ProcessCoord(pipe=6, data=5, model=3): 215, ProcessCoord(pipe=6, data=6, model=0): 216, ProcessCoord(pipe=6, data=6, model=1): 217, ProcessCoord(pipe=6, data=6, model=2): 218, ProcessCoord(pipe=6, data=6, model=3): 219, ProcessCoord(pipe=6, data=7, model=0): 220, ProcessCoord(pipe=6, data=7, model=1): 221, ProcessCoord(pipe=6, data=7, model=2): 222, ProcessCoord(pipe=6, data=7, model=3): 223, ProcessCoord(pipe=7, data=0, model=0): 224, ProcessCoord(pipe=7, data=0, model=1): 225, ProcessCoord(pipe=7, data=0, 
model=2): 226, ProcessCoord(pipe=7, data=0, model=3): 227, ProcessCoord(pipe=7, data=1, model=0): 228, ProcessCoord(pipe=7, data=1, model=1): 229, ProcessCoord(pipe=7, data=1, model=2): 230, ProcessCoord(pipe=7, data=1, model=3): 231, ProcessCoord(pipe=7, data=2, model=0): 232, ProcessCoord(pipe=7, data=2, model=1): 233, ProcessCoord(pipe=7, data=2, model=2): 234, ProcessCoord(pipe=7, data=2, model=3): 235, ProcessCoord(pipe=7, data=3, model=0): 236, ProcessCoord(pipe=7, data=3, model=1): 237, ProcessCoord(pipe=7, data=3, model=2): 238, ProcessCoord(pipe=7, data=3, model=3): 239, ProcessCoord(pipe=7, data=4, model=0): 240, ProcessCoord(pipe=7, data=4, model=1): 241, ProcessCoord(pipe=7, data=4, model=2): 242, ProcessCoord(pipe=7, data=4, model=3): 243, ProcessCoord(pipe=7, data=5, model=0): 244, ProcessCoord(pipe=7, data=5, model=1): 245, ProcessCoord(pipe=7, data=5, model=2): 246, ProcessCoord(pipe=7, data=5, model=3): 247, ProcessCoord(pipe=7, data=6, model=0): 248, ProcessCoord(pipe=7, data=6, model=1): 249, ProcessCoord(pipe=7, data=6, model=2): 250, ProcessCoord(pipe=7, data=6, model=3): 251, ProcessCoord(pipe=7, data=7, model=0): 252, ProcessCoord(pipe=7, data=7, model=1): 253, ProcessCoord(pipe=7, data=7, model=2): 254, ProcessCoord(pipe=7, data=7, model=3): 255}
-[2021-09-27 17:44:04,887] [INFO] [module.py:360:_partition_layers] Partitioning pipeline stages with method type:transformer
-stage=0 layers=7
-     0: _to_float16
-     1: EmbeddingPipe
-     2: <lambda>
-     3: ParallelTransformerLayerPipe
-     4: ParallelTransformerLayerPipe
-     5: ParallelTransformerLayerPipe
-     6: ParallelTransformerLayerPipe
-stage=1 layers=4
-     7: ParallelTransformerLayerPipe
-     8: ParallelTransformerLayerPipe
-     9: ParallelTransformerLayerPipe
-    10: ParallelTransformerLayerPipe
-stage=2 layers=4
-    11: ParallelTransformerLayerPipe
-    12: ParallelTransformerLayerPipe
-    13: ParallelTransformerLayerPipe
-    14: ParallelTransformerLayerPipe
-stage=3 layers=4
-    15: ParallelTransformerLayerPipe
-    16: ParallelTransformerLayerPipe
-    17: ParallelTransformerLayerPipe
-    18: ParallelTransformerLayerPipe
-stage=4 layers=4
-    19: ParallelTransformerLayerPipe
-    20: ParallelTransformerLayerPipe
-    21: ParallelTransformerLayerPipe
-    22: ParallelTransformerLayerPipe
-stage=5 layers=4
-    23: ParallelTransformerLayerPipe
-    24: ParallelTransformerLayerPipe
-    25: ParallelTransformerLayerPipe
-    26: ParallelTransformerLayerPipe
-stage=6 layers=4
-    27: ParallelTransformerLayerPipe
-    28: ParallelTransformerLayerPipe
-    29: ParallelTransformerLayerPipe
-    30: ParallelTransformerLayerPipe
-stage=7 layers=8
-    31: ParallelTransformerLayerPipe
-    32: ParallelTransformerLayerPipe
-    33: ParallelTransformerLayerPipe
-    34: ParallelTransformerLayerPipe
-    35: <lambda>
-    36: MixedFusedLayerNorm
-    37: EmbeddingPipe
-    38: float16_to_fp32
-  loss: CrossEntropy
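The partition printed above comes from the method type:transformer setting: DeepSpeed balances the count of ParallelTransformerLayerPipe blocks per stage (32 transformer layers over 8 stages gives 4 each), while the non-transformer helper modules (_to_float16, EmbeddingPipe, MixedFusedLayerNorm, float16_to_fp32) ride along with the first and last stages. A conceptual sketch of that boundary computation (a simplification of DeepSpeed's PipelineModule._partition_layers, not the actual implementation):

```python
# Conceptual sketch (assumption: simplified from DeepSpeed's pipeline
# partitioner with method "type:transformer"): pick stage boundaries so each
# stage owns the same number of transformer layers; leading/trailing helper
# modules stick to the first/last stage.
def transformer_partition(layer_names, num_stages):
    match = [i for i, n in enumerate(layer_names) if "TransformerLayer" in n]
    per_stage = len(match) // num_stages  # 32 transformer layers / 8 stages = 4
    starts = [0] + [match[s * per_stage] for s in range(1, num_stages)]
    return starts + [len(layer_names)]   # stage s spans [starts[s], starts[s+1])

names = (["_to_float16", "EmbeddingPipe", "<lambda>"]
         + ["ParallelTransformerLayerPipe"] * 32
         + ["<lambda>", "MixedFusedLayerNorm", "EmbeddingPipe", "float16_to_fp32"])

# Reproduces the stage spans logged above: [0, 7), [7, 11), ..., [31, 39)
assert transformer_partition(names, 8) == [0, 7, 11, 15, 19, 23, 27, 31, 39]
```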
- > number of parameters on (tensor, pipeline) model parallel rank (3, 6): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (2, 6): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (0, 6): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (0, 4): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (0, 5): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (3, 5): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (1, 5): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (2, 5): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (1, 2): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (0, 2): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (3, 2): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (2, 2): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (2, 3): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (1, 3): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (0, 3): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (3, 3): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (3, 4): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (2, 4): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (1, 4): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (2, 1): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (1, 6): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (3, 1): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (1, 1): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (0, 1): 1745293312
- > number of parameters on (tensor, pipeline) model parallel rank (2, 0): 1986465792
- > number of parameters on (tensor, pipeline) model parallel rank (2, 7): 1986498560
- > number of parameters on (tensor, pipeline) model parallel rank (3, 0): 1986465792
- > number of parameters on (tensor, pipeline) model parallel rank (1, 0): 1986465792
- > number of parameters on (tensor, pipeline) model parallel rank (3, 7): 1986498560
- > number of parameters on (tensor, pipeline) model parallel rank (1, 7): 1986498560
- > number of parameters on (tensor, pipeline) model parallel rank (0, 7): 1986498560
-[2021-09-27 17:44:06,136] [INFO] [utils.py:680:see_memory_usage] After Building Model
-[2021-09-27 17:44:06,136] [INFO] [utils.py:681:see_memory_usage] MA 3.77 GB Max_MA 3.79 GB CA 3.79 GB Max_CA 4 GB
-[2021-09-27 17:44:06,137] [INFO] [utils.py:689:see_memory_usage] CPU Virtual Memory: used = 37.66 GB, percent = 20.1%
- > number of parameters on (tensor, pipeline) model parallel rank (0, 0): 1986465792
-setting training iterations to 159576
-> learning rate decay style: cosine
-DeepSpeed is enabled.
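The topology map printed before the partition enumerates ranks with pipe as the slowest-varying coordinate, then data, then model (tensor): 8 pipeline stages × 8 data-parallel replicas × 4 tensor-parallel slices = 256 ranks. A small sketch of the coordinate arithmetic that map implies (helper names are illustrative, not a Megatron/DeepSpeed API):

```python
# Rank layout inferred from the ProcessCoord entries logged above:
# pipe is the outermost axis, model/tensor the innermost.
PIPE, DATA, MODEL = 8, 8, 4  # 8 * 8 * 4 = 256 processes

def coord_to_rank(pipe, data, model):
    return (pipe * DATA + data) * MODEL + model

def rank_to_coord(rank):
    pipe, rest = divmod(rank, DATA * MODEL)
    data, model = divmod(rest, MODEL)
    return pipe, data, model

assert coord_to_rank(0, 1, 0) == 4      # ProcessCoord(pipe=0, data=1, model=0): 4
assert coord_to_rank(2, 0, 0) == 64     # ProcessCoord(pipe=2, data=0, model=0): 64
assert rank_to_coord(255) == (7, 7, 3)  # the last entry of the map
```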
-[2021-09-27 17:44:06,204] [INFO] [logging.py:68:log_dist] [Rank 0] DeepSpeed info: version=0.4.2+bc17042, git-hash=bc17042, git-branch=big-science -[2021-09-27 17:44:06,292] [INFO] [engine.py:179:__init__] DeepSpeed Flops Profiler Enabled: False -[2021-09-27 17:44:06,292] [INFO] [engine.py:736:_configure_optimizer] Removing param_group that has no 'params' in the client Optimizer -[2021-09-27 17:44:06,292] [INFO] [engine.py:741:_configure_optimizer] Using client Optimizer as basic optimizer -[2021-09-27 17:44:06,292] [INFO] [engine.py:750:_configure_optimizer] DeepSpeed Basic Optimizer = FusedAdam -[2021-09-27 17:44:06,292] [INFO] [utils.py:43:is_zero_supported_optimizer] Checking ZeRO support for optimizer=FusedAdam type= -[2021-09-27 17:44:06,292] [INFO] [logging.py:68:log_dist] [Rank 0] Creating fp16 ZeRO stage 1 optimizer -[2021-09-27 17:44:06,292] [INFO] [stage2.py:106:__init__] Reduce bucket size 500000000 -[2021-09-27 17:44:06,292] [INFO] [stage2.py:107:__init__] Allgather bucket size 500000000 -[2021-09-27 17:44:06,292] [INFO] [stage2.py:108:__init__] CPU Offload: False -[2021-09-27 17:44:06,292] [INFO] [stage2.py:109:__init__] Round robin gradient partitioning: False -[2021-09-27 17:44:11,004] [INFO] [stage2.py:419:__init__] optimizer state initialized -[2021-09-27 17:44:11,004] [INFO] [logging.py:68:log_dist] [Rank 0] DeepSpeed Final Optimizer = FusedAdam -[2021-09-27 17:44:11,004] [INFO] [engine.py:553:_configure_lr_scheduler] DeepSpeed using client LR scheduler -[2021-09-27 17:44:11,005] [INFO] [logging.py:68:log_dist] [Rank 0] DeepSpeed LR Scheduler = -[2021-09-27 17:44:11,005] [INFO] [logging.py:68:log_dist] [Rank 0] step=0, skipped=0, lr=[0.0, 0.0], mom=[(0.9, 0.999), (0.9, 0.999)] -[2021-09-27 17:44:11,005] [INFO] [config.py:900:print] DeepSpeedEngine configuration: -[2021-09-27 17:44:11,005] [INFO] [config.py:904:print] activation_checkpointing_config { - "partition_activations": false, - "contiguous_memory_optimization": false, - "cpu_checkpointing": false, - "number_checkpoints": null, - "synchronize_checkpoint_boundary": false, - "profile": false -} -[2021-09-27 17:44:11,005] [INFO] [config.py:904:print] aio_config ................... {'block_size': 1048576, 'queue_depth': 8, 'thread_count': 1, 'single_submit': False, 'overlap_events': True} -[2021-09-27 17:44:11,005] [INFO] [config.py:904:print] allreduce_always_fp32 ........ False -[2021-09-27 17:44:11,005] [INFO] [config.py:904:print] amp_enabled .................. False -[2021-09-27 17:44:11,005] [INFO] [config.py:904:print] amp_params ................... False -[2021-09-27 17:44:11,005] [INFO] [config.py:904:print] checkpoint_tag_validation_enabled True -[2021-09-27 17:44:11,005] [INFO] [config.py:904:print] checkpoint_tag_validation_fail False -[2021-09-27 17:44:11,005] [INFO] [config.py:904:print] disable_allgather ............ False -[2021-09-27 17:44:11,005] [INFO] [config.py:904:print] dump_state ................... False -[2021-09-27 17:44:11,005] [INFO] [config.py:904:print] dynamic_loss_scale_args ...... {'init_scale': 4096, 'scale_window': 500, 'delayed_shift': 2, 'min_scale': 1} -[2021-09-27 17:44:11,005] [INFO] [config.py:904:print] eigenvalue_enabled ........... False -[2021-09-27 17:44:11,005] [INFO] [config.py:904:print] eigenvalue_gas_boundary_resolution 1 -[2021-09-27 17:44:11,005] [INFO] [config.py:904:print] eigenvalue_layer_name ........ bert.encoder.layer -[2021-09-27 17:44:11,005] [INFO] [config.py:904:print] eigenvalue_layer_num ......... 
0 -[2021-09-27 17:44:11,005] [INFO] [config.py:904:print] eigenvalue_max_iter .......... 100 -[2021-09-27 17:44:11,006] [INFO] [config.py:904:print] eigenvalue_stability ......... 1e-06 -[2021-09-27 17:44:11,006] [INFO] [config.py:904:print] eigenvalue_tol ............... 0.01 -[2021-09-27 17:44:11,006] [INFO] [config.py:904:print] eigenvalue_verbose ........... False -[2021-09-27 17:44:11,006] [INFO] [config.py:904:print] elasticity_enabled ........... False -[2021-09-27 17:44:11,006] [INFO] [config.py:904:print] flops_profiler_config ........ { - "enabled": false, - "profile_step": 1, - "module_depth": -1, - "top_modules": 1, - "detailed": true, - "output_file": null -} -[2021-09-27 17:44:11,006] [INFO] [config.py:904:print] fp16_enabled ................. True -[2021-09-27 17:44:11,006] [INFO] [config.py:904:print] fp16_mixed_quantize .......... False -[2021-09-27 17:44:11,006] [INFO] [config.py:904:print] global_rank .................. 0 -[2021-09-27 17:44:11,006] [INFO] [config.py:904:print] gradient_accumulation_steps .. 256 -[2021-09-27 17:44:11,006] [INFO] [config.py:904:print] gradient_clipping ............ 1.0 -[2021-09-27 17:44:11,006] [INFO] [config.py:904:print] gradient_predivide_factor .... 1.0 -[2021-09-27 17:44:11,006] [INFO] [config.py:904:print] initial_dynamic_scale ........ 4096 -[2021-09-27 17:44:11,006] [INFO] [config.py:904:print] loss_scale ................... 0 -[2021-09-27 17:44:11,006] [INFO] [config.py:904:print] memory_breakdown ............. False -[2021-09-27 17:44:11,006] [INFO] [config.py:904:print] optimizer_legacy_fusion ...... False -[2021-09-27 17:44:11,006] [INFO] [config.py:904:print] optimizer_name ............... None -[2021-09-27 17:44:11,006] [INFO] [config.py:904:print] optimizer_params ............. None -[2021-09-27 17:44:11,006] [INFO] [config.py:904:print] pipeline ..................... {'stages': 'auto', 'partition': 'best', 'seed_layers': False, 'activation_checkpoint_interval': 0} -[2021-09-27 17:44:11,006] [INFO] [config.py:904:print] pld_enabled .................. False -[2021-09-27 17:44:11,006] [INFO] [config.py:904:print] pld_params ................... False -[2021-09-27 17:44:11,006] [INFO] [config.py:904:print] prescale_gradients ........... False -[2021-09-27 17:44:11,006] [INFO] [config.py:904:print] quantize_change_rate ......... 0.001 -[2021-09-27 17:44:11,006] [INFO] [config.py:904:print] quantize_groups .............. 1 -[2021-09-27 17:44:11,006] [INFO] [config.py:904:print] quantize_offset .............. 1000 -[2021-09-27 17:44:11,006] [INFO] [config.py:904:print] quantize_period .............. 1000 -[2021-09-27 17:44:11,006] [INFO] [config.py:904:print] quantize_rounding ............ 0 -[2021-09-27 17:44:11,006] [INFO] [config.py:904:print] quantize_start_bits .......... 16 -[2021-09-27 17:44:11,006] [INFO] [config.py:904:print] quantize_target_bits ......... 8 -[2021-09-27 17:44:11,006] [INFO] [config.py:904:print] quantize_training_enabled .... False -[2021-09-27 17:44:11,006] [INFO] [config.py:904:print] quantize_type ................ 0 -[2021-09-27 17:44:11,006] [INFO] [config.py:904:print] quantize_verbose ............. False -[2021-09-27 17:44:11,006] [INFO] [config.py:904:print] scheduler_name ............... None -[2021-09-27 17:44:11,006] [INFO] [config.py:904:print] scheduler_params ............. None -[2021-09-27 17:44:11,006] [INFO] [config.py:904:print] sparse_attention ............. None -[2021-09-27 17:44:11,007] [INFO] [config.py:904:print] sparse_gradients_enabled ..... 
False -[2021-09-27 17:44:11,007] [INFO] [config.py:904:print] steps_per_print .............. 2000 -[2021-09-27 17:44:11,007] [INFO] [config.py:904:print] tensorboard_enabled .......... False -[2021-09-27 17:44:11,007] [INFO] [config.py:904:print] tensorboard_job_name ......... DeepSpeedJobName -[2021-09-27 17:44:11,007] [INFO] [config.py:904:print] tensorboard_output_path ...... -[2021-09-27 17:44:11,007] [INFO] [config.py:904:print] train_batch_size ............. 2048 -[2021-09-27 17:44:11,007] [INFO] [config.py:904:print] train_micro_batch_size_per_gpu 1 -[2021-09-27 17:44:11,007] [INFO] [config.py:904:print] use_quantizer_kernel ......... False -[2021-09-27 17:44:11,007] [INFO] [config.py:904:print] wall_clock_breakdown ......... False -[2021-09-27 17:44:11,007] [INFO] [config.py:904:print] world_size ................... 8 -[2021-09-27 17:44:11,007] [INFO] [config.py:904:print] zero_allow_untested_optimizer False -[2021-09-27 17:44:11,007] [INFO] [config.py:904:print] zero_config .................. { - "stage": 1, - "contiguous_gradients": false, - "reduce_scatter": true, - "reduce_bucket_size": 5.000000e+08, - "allgather_partitions": true, - "allgather_bucket_size": 5.000000e+08, - "overlap_comm": false, - "load_from_fp32_weights": true, - "elastic_checkpoint": true, - "offload_param": null, - "offload_optimizer": null, - "sub_group_size": 1.000000e+09, - "prefetch_bucket_size": 5.000000e+07, - "param_persistence_threshold": 1.000000e+05, - "max_live_parameters": 1.000000e+09, - "max_reuse_distance": 1.000000e+09, - "gather_fp16_weights_on_model_save": false, - "ignore_unused_parameters": true, - "round_robin_gradients": false, - "legacy_stage1": false -} -[2021-09-27 17:44:11,007] [INFO] [config.py:904:print] zero_enabled ................. True -[2021-09-27 17:44:11,007] [INFO] [config.py:904:print] zero_optimization_stage ...... 
1
-[2021-09-27 17:44:11,007] [INFO] [config.py:906:print] json = {
-    "train_micro_batch_size_per_gpu": 1,
-    "train_batch_size": 2.048000e+03,
-    "gradient_clipping": 1.0,
-    "zero_optimization": {
-        "stage": 1
-    },
-    "fp16": {
-        "enabled": true,
-        "loss_scale": 0,
-        "loss_scale_window": 500,
-        "hysteresis": 2,
-        "min_loss_scale": 1,
-        "initial_scale_power": 12
-    },
-    "steps_per_print": 2.000000e+03,
-    "wall_clock_breakdown": false
-}
-[2021-09-27 17:44:11,007] [INFO] [engine.py:76:__init__] CONFIG: micro_batches=256 micro_batch_size=1
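The json block above is the user-supplied DeepSpeed config for this run: ZeRO stage 1 plus fp16 with dynamic loss scaling starting at 2^12 = 4096, matching the initial_dynamic_scale logged earlier. A sketch of how such a dict is wired into an engine, assuming the current deepspeed.initialize keyword names and with a toy model and optimizer standing in for the GPT pipeline module and FusedAdam built earlier in this log:

```python
# Sketch only: the toy model/optimizer are placeholders for Megatron's
# GPTModelPipe and FusedAdam; the fp16 path needs a GPU to actually run.
import torch
import deepspeed

ds_config = {
    "train_micro_batch_size_per_gpu": 1,
    "train_batch_size": 2048,          # 1 micro-batch x 256 accumulation steps x 8 DP replicas
    "gradient_clipping": 1.0,
    "zero_optimization": {"stage": 1},
    "fp16": {
        "enabled": True,
        "loss_scale": 0,               # 0 selects dynamic loss scaling
        "loss_scale_window": 500,
        "hysteresis": 2,
        "min_loss_scale": 1,
        "initial_scale_power": 12,     # initial scale = 2**12 = 4096, as logged
    },
    "steps_per_print": 2000,
    "wall_clock_breakdown": False,
}

model = torch.nn.Linear(1024, 1024)                         # placeholder module
optimizer = torch.optim.AdamW(model.parameters(), lr=6e-5)  # placeholder optimizer

engine, optimizer, _, _ = deepspeed.initialize(
    model=model, optimizer=optimizer, config=ds_config
)
```

When a client optimizer is supplied this way, DeepSpeed keeps it (the "Using client Optimizer as basic optimizer" and "using client LR scheduler" lines above) and only wraps it with the ZeRO stage 1 partitioning and fp16 machinery; train_batch_size is consistent with the CONFIG: micro_batches=256 micro_batch_size=1 line just above.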
-[2021-09-27 17:44:11,311] [INFO] [engine.py:134:__init__] RANK=0 STAGE=0 LAYERS=7 [0, 7) STAGE_PARAMS=1986465792 (1986.466M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-27 17:44:11,311] [INFO] [engine.py:134:__init__] RANK=1 STAGE=0 LAYERS=7 [0, 7) STAGE_PARAMS=1986465792 (1986.466M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-27 17:44:11,311] [INFO] [engine.py:134:__init__] RANK=3 STAGE=0 LAYERS=7 [0, 7) STAGE_PARAMS=1986465792 (1986.466M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-27 17:44:11,311] [INFO] [engine.py:134:__init__] RANK=2 STAGE=0 LAYERS=7 [0, 7) STAGE_PARAMS=1986465792 (1986.466M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-27 17:44:11,311] [INFO] [engine.py:134:__init__] RANK=131 STAGE=4 LAYERS=4 [19, 23) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-27 17:44:11,311] [INFO] [engine.py:134:__init__] RANK=129 STAGE=4 LAYERS=4 [19, 23) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-27 17:44:11,311] [INFO] [engine.py:134:__init__] RANK=128 STAGE=4 LAYERS=4 [19, 23) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-27 17:44:11,311] [INFO] [engine.py:134:__init__] RANK=130 STAGE=4 LAYERS=4 [19, 23) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-27 17:44:11,311] [INFO] [engine.py:134:__init__] RANK=224 STAGE=7 LAYERS=8 [31, 39) STAGE_PARAMS=1986498560 (1986.499M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-27 17:44:11,311] [INFO] [engine.py:134:__init__] RANK=225 STAGE=7 LAYERS=8 [31, 39) STAGE_PARAMS=1986498560 (1986.499M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-27 17:44:11,311] [INFO] [engine.py:134:__init__] RANK=227 STAGE=7 LAYERS=8 [31, 39) STAGE_PARAMS=1986498560 (1986.499M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-27 17:44:11,311] [INFO] [engine.py:134:__init__] RANK=226 STAGE=7 LAYERS=8 [31, 39) STAGE_PARAMS=1986498560 (1986.499M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-27 17:44:11,311] [INFO] [engine.py:134:__init__] RANK=65 STAGE=2 LAYERS=4 [11, 15) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-27 17:44:11,311] [INFO] [engine.py:134:__init__] RANK=66 STAGE=2 LAYERS=4 [11, 15) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-27 17:44:11,311] [INFO] [engine.py:134:__init__] RANK=64 STAGE=2 LAYERS=4 [11, 15) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-27 17:44:11,311] [INFO] [engine.py:134:__init__] RANK=67 STAGE=2 LAYERS=4 [11, 15) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-27 17:44:11,311] [INFO] [engine.py:134:__init__] RANK=193 STAGE=6 LAYERS=4 [27, 31) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-27 17:44:11,311] [INFO] [engine.py:134:__init__] RANK=194 STAGE=6 LAYERS=4 [27, 31) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-27 17:44:11,311] [INFO] [engine.py:134:__init__] RANK=195 STAGE=6 LAYERS=4 [27, 31) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-27 17:44:11,311] [INFO] [engine.py:134:__init__] RANK=192 STAGE=6 LAYERS=4 [27, 31) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-27 17:44:11,311] [INFO] [engine.py:134:__init__] RANK=32 STAGE=1 LAYERS=4 [7, 11) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-27 17:44:11,311] [INFO] [engine.py:134:__init__] RANK=35 STAGE=1 LAYERS=4 [7, 11) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-27 17:44:11,311] [INFO] [engine.py:134:__init__] RANK=34 STAGE=1 LAYERS=4 [7, 11) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-27 17:44:11,311] [INFO] [engine.py:134:__init__] RANK=33 STAGE=1 LAYERS=4 [7, 11) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-27 17:44:11,311] [INFO] [engine.py:134:__init__] RANK=99 STAGE=3 LAYERS=4 [15, 19) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-27 17:44:11,311] [INFO] [engine.py:134:__init__] RANK=98 STAGE=3 LAYERS=4 [15, 19) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-27 17:44:11,311] [INFO] [engine.py:134:__init__] RANK=97 STAGE=3 LAYERS=4 [15, 19) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-27 17:44:11,311] [INFO] [engine.py:134:__init__] RANK=161 STAGE=5 LAYERS=4 [23, 27) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-27 17:44:11,311] [INFO] [engine.py:134:__init__] RANK=160 STAGE=5 LAYERS=4 [23, 27) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-27 17:44:11,311] [INFO] [engine.py:134:__init__] RANK=162 STAGE=5 LAYERS=4 [23, 27) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-27 17:44:11,311] [INFO] [engine.py:134:__init__] RANK=163 STAGE=5 LAYERS=4 [23, 27) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M)
-[2021-09-27 17:44:11,311] [INFO] [engine.py:134:__init__] RANK=96 STAGE=3 LAYERS=4 [15, 19) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M)
UNIQUE_PARAMS=56814206976 (56814.207M) - > using checkpoint value 6e-05 for learning rate - > using checkpoint value 6e-06 for minimum learning rate - > using checkpoint value 216320 for warmup iterations - > using checkpoint value 126953125 for total number of iterations - > using checkpoint value cosine for decay style -successfully loaded 8 ZeRO state_dicts for rank 196 -successfully loaded 8 ZeRO state_dicts for rank 207 -successfully loaded 8 ZeRO state_dicts for rank 96 -successfully loaded 8 ZeRO state_dicts for rank 192 -successfully loaded 8 ZeRO state_dicts for rank 195 -successfully loaded 8 ZeRO state_dicts for rank 212 -successfully loaded 8 ZeRO state_dicts for rank 154 -successfully loaded 8 ZeRO state_dicts for rank 148 -successfully loaded 8 ZeRO state_dicts for rank 198 -successfully loaded 8 ZeRO state_dicts for rank 112 -successfully loaded 8 ZeRO state_dicts for rank 104 -successfully loaded 8 ZeRO state_dicts for rank 42 -successfully loaded 8 ZeRO state_dicts for rank 199 -successfully loaded 8 ZeRO state_dicts for rank 120 -successfully loaded 8 ZeRO state_dicts for rank 205 -successfully loaded 8 ZeRO state_dicts for rank 193 -successfully loaded 8 ZeRO state_dicts for rank 116 -successfully loaded 8 ZeRO state_dicts for rank 158 -successfully loaded 8 ZeRO state_dicts for rank 150 -successfully loaded 8 ZeRO state_dicts for rank 100 -successfully loaded 8 ZeRO state_dicts for rank 166 -successfully loaded 8 ZeRO state_dicts for rank 62 -successfully loaded 8 ZeRO state_dicts for rank 134 -successfully loaded 8 ZeRO state_dicts for rank 204 -successfully loaded 8 ZeRO state_dicts for rank 67 -successfully loaded 8 ZeRO state_dicts for rank 136 -successfully loaded 8 ZeRO state_dicts for rank 145 -successfully loaded 8 ZeRO state_dicts for rank 182 -successfully loaded 8 ZeRO state_dicts for rank 133 -successfully loaded 8 ZeRO state_dicts for rank 65 -successfully loaded 8 ZeRO state_dicts for rank 208 -successfully loaded 8 ZeRO state_dicts for rank 124 -successfully loaded 8 ZeRO state_dicts for rank 130 -successfully loaded 8 ZeRO state_dicts for rank 141 -successfully loaded 8 ZeRO state_dicts for rank 132 -successfully loaded 8 ZeRO state_dicts for rank 206 -successfully loaded 8 ZeRO state_dicts for rank 157 -successfully loaded 8 ZeRO state_dicts for rank 115 -successfully loaded 8 ZeRO state_dicts for rank 200 -successfully loaded 8 ZeRO state_dicts for rank 152 -successfully loaded 8 ZeRO state_dicts for rank 89 -successfully loaded 8 ZeRO state_dicts for rank 40 -successfully loaded 8 ZeRO state_dicts for rank 140 -successfully loaded 8 ZeRO state_dicts for rank 75 -loading 8 zero partition checkpoints for rank 196 -successfully loaded 8 ZeRO state_dicts for rank 122 -successfully loaded 8 ZeRO state_dicts for rank 153 -successfully loaded 8 ZeRO state_dicts for rank 32 -successfully loaded 8 ZeRO state_dicts for rank 216 -successfully loaded 8 ZeRO state_dicts for rank 57 -successfully loaded 8 ZeRO state_dicts for rank 165 -successfully loaded 8 ZeRO state_dicts for rank 60 -successfully loaded 8 ZeRO state_dicts for rank 63 -successfully loaded 8 ZeRO state_dicts for rank 83 -successfully loaded 8 ZeRO state_dicts for rank 99 -successfully loaded 8 ZeRO state_dicts for rank 108 -successfully loaded 8 ZeRO state_dicts for rank 77 -successfully loaded 8 ZeRO state_dicts for rank 190 -successfully loaded 8 ZeRO state_dicts for rank 146 -successfully loaded 8 ZeRO state_dicts for rank 36 -successfully loaded 8 ZeRO state_dicts for rank 114 -successfully 
loaded 8 ZeRO state_dicts for rank 129 -successfully loaded 8 ZeRO state_dicts for rank 54 -successfully loaded 8 ZeRO state_dicts for rank 98 -successfully loaded 8 ZeRO state_dicts for rank 220 -successfully loaded 8 ZeRO state_dicts for rank 93 -successfully loaded 8 ZeRO state_dicts for rank 144 -successfully loaded 8 ZeRO state_dicts for rank 64 -successfully loaded 8 ZeRO state_dicts for rank 76 -successfully loaded 8 ZeRO state_dicts for rank 58 -successfully loaded 8 ZeRO state_dicts for rank 72 -successfully loaded 8 ZeRO state_dicts for rank 155 -successfully loaded 8 ZeRO state_dicts for rank 103 -successfully loaded 8 ZeRO state_dicts for rank 80 -successfully loaded 8 ZeRO state_dicts for rank 34 -successfully loaded 8 ZeRO state_dicts for rank 149 -successfully loaded 8 ZeRO state_dicts for rank 87 -successfully loaded 8 ZeRO state_dicts for rank 52 -successfully loaded 8 ZeRO state_dicts for rank 41 -successfully loaded 8 ZeRO state_dicts for rank 84 -successfully loaded 8 ZeRO state_dicts for rank 37 -successfully loaded 8 ZeRO state_dicts for rank 38 -successfully loaded 8 ZeRO state_dicts for rank 107 -successfully loaded 8 ZeRO state_dicts for rank 48 -successfully loaded 8 ZeRO state_dicts for rank 44 -successfully loaded 8 ZeRO state_dicts for rank 53 -loading 8 zero partition checkpoints for rank 96 -successfully loaded 8 ZeRO state_dicts for rank 79 -successfully loaded 8 ZeRO state_dicts for rank 164 -successfully loaded 8 ZeRO state_dicts for rank 46 -successfully loaded 8 ZeRO state_dicts for rank 73 -successfully loaded 8 ZeRO state_dicts for rank 91 -successfully loaded 8 ZeRO state_dicts for rank 106 -successfully loaded 8 ZeRO state_dicts for rank 71 -loading 8 zero partition checkpoints for rank 207 -successfully loaded 8 ZeRO state_dicts for rank 138 -successfully loaded 8 ZeRO state_dicts for rank 33 -successfully loaded 8 ZeRO state_dicts for rank 156 -successfully loaded 8 ZeRO state_dicts for rank 201 -successfully loaded 8 ZeRO state_dicts for rank 61 -successfully loaded 8 ZeRO state_dicts for rank 178 -successfully loaded 8 ZeRO state_dicts for rank 39 -successfully loaded 8 ZeRO state_dicts for rank 111 -successfully loaded 8 ZeRO state_dicts for rank 215 -successfully loaded 8 ZeRO state_dicts for rank 191 -successfully loaded 8 ZeRO state_dicts for rank 147 -successfully loaded 8 ZeRO state_dicts for rank 167 -successfully loaded 8 ZeRO state_dicts for rank 170 -successfully loaded 8 ZeRO state_dicts for rank 95 -successfully loaded 8 ZeRO state_dicts for rank 142 -successfully loaded 8 ZeRO state_dicts for rank 151 -successfully loaded 8 ZeRO state_dicts for rank 135 -successfully loaded 8 ZeRO state_dicts for rank 118 -successfully loaded 8 ZeRO state_dicts for rank 97 -successfully loaded 8 ZeRO state_dicts for rank 159 -successfully loaded 8 ZeRO state_dicts for rank 174 -successfully loaded 8 ZeRO state_dicts for rank 219 -successfully loaded 8 ZeRO state_dicts for rank 211 -successfully loaded 8 ZeRO state_dicts for rank 180 -successfully loaded 8 ZeRO state_dicts for rank 143 -successfully loaded 8 ZeRO state_dicts for rank 43 -successfully loaded 8 ZeRO state_dicts for rank 171 -successfully loaded 8 ZeRO state_dicts for rank 55 -successfully loaded 8 ZeRO state_dicts for rank 59 -successfully loaded 8 ZeRO state_dicts for rank 203 -successfully loaded 8 ZeRO state_dicts for rank 45 -successfully loaded 8 ZeRO state_dicts for rank 210 -successfully loaded 8 ZeRO state_dicts for rank 50 -successfully loaded 8 ZeRO state_dicts for rank 113 
-successfully loaded 8 ZeRO state_dicts for rank 68 -successfully loaded 8 ZeRO state_dicts for rank 128 -successfully loaded 8 ZeRO state_dicts for rank 187 -successfully loaded 8 ZeRO state_dicts for rank 186 -successfully loaded 8 ZeRO state_dicts for rank 102 -successfully loaded 8 ZeRO state_dicts for rank 109 -successfully loaded 8 ZeRO state_dicts for rank 56 -successfully loaded 8 ZeRO state_dicts for rank 137 -successfully loaded 8 ZeRO state_dicts for rank 81 -successfully loaded 8 ZeRO state_dicts for rank 169 -successfully loaded 8 ZeRO state_dicts for rank 202 -successfully loaded 8 ZeRO state_dicts for rank 10 -successfully loaded 8 ZeRO state_dicts for rank 110 -loading 8 zero partition checkpoints for rank 195 -successfully loaded 8 ZeRO state_dicts for rank 197 -successfully loaded 8 ZeRO state_dicts for rank 119 -successfully loaded 8 ZeRO state_dicts for rank 105 -successfully loaded 8 ZeRO state_dicts for rank 88 -successfully loaded 8 ZeRO state_dicts for rank 92 -successfully loaded 8 ZeRO state_dicts for rank 214 -successfully loaded 8 ZeRO state_dicts for rank 223 -successfully loaded 8 ZeRO state_dicts for rank 126 -successfully loaded 8 ZeRO state_dicts for rank 162 -loading 8 zero partition checkpoints for rank 192 -successfully loaded 8 ZeRO state_dicts for rank 173 -successfully loaded 8 ZeRO state_dicts for rank 125 -successfully loaded 8 ZeRO state_dicts for rank 90 -successfully loaded 8 ZeRO state_dicts for rank 121 -successfully loaded 8 ZeRO state_dicts for rank 123 -successfully loaded 8 ZeRO state_dicts for rank 163 -successfully loaded 8 ZeRO state_dicts for rank 127 -successfully loaded 8 ZeRO state_dicts for rank 51 -successfully loaded 8 ZeRO state_dicts for rank 78 -successfully loaded 8 ZeRO state_dicts for rank 213 -successfully loaded 8 ZeRO state_dicts for rank 181 -successfully loaded 8 ZeRO state_dicts for rank 194 -successfully loaded 8 ZeRO state_dicts for rank 218 -successfully loaded 8 ZeRO state_dicts for rank 35 -successfully loaded 8 ZeRO state_dicts for rank 22 -successfully loaded 8 ZeRO state_dicts for rank 188 -successfully loaded 8 ZeRO state_dicts for rank 139 -successfully loaded 8 ZeRO state_dicts for rank 47 -successfully loaded 8 ZeRO state_dicts for rank 175 -successfully loaded 8 ZeRO state_dicts for rank 168 -successfully loaded 8 ZeRO state_dicts for rank 184 -successfully loaded 8 ZeRO state_dicts for rank 69 -successfully loaded 8 ZeRO state_dicts for rank 85 -loading 8 zero partition checkpoints for rank 154 -successfully loaded 8 ZeRO state_dicts for rank 66 -successfully loaded 8 ZeRO state_dicts for rank 117 -successfully loaded 8 ZeRO state_dicts for rank 161 -successfully loaded 8 ZeRO state_dicts for rank 49 -successfully loaded 8 ZeRO state_dicts for rank 86 -successfully loaded 8 ZeRO state_dicts for rank 101 -successfully loaded 8 ZeRO state_dicts for rank 222 -successfully loaded 8 ZeRO state_dicts for rank 70 -successfully loaded 8 ZeRO state_dicts for rank 30 -successfully loaded 8 ZeRO state_dicts for rank 131 -successfully loaded 8 ZeRO state_dicts for rank 183 -loading 8 zero partition checkpoints for rank 112 -successfully loaded 8 ZeRO state_dicts for rank 94 -successfully loaded 8 ZeRO state_dicts for rank 217 -successfully loaded 8 ZeRO state_dicts for rank 82 -successfully loaded 8 ZeRO state_dicts for rank 8 -successfully loaded 8 ZeRO state_dicts for rank 160 -successfully loaded 8 ZeRO state_dicts for rank 252 -loading 8 zero partition checkpoints for rank 205 -successfully loaded 8 ZeRO 
state_dicts for rank 172 -successfully loaded 8 ZeRO state_dicts for rank 14 -loading 8 zero partition checkpoints for rank 42 -loading 8 zero partition checkpoints for rank 104 -loading 8 zero partition checkpoints for rank 193 -successfully loaded 8 ZeRO state_dicts for rank 189 -successfully loaded 8 ZeRO state_dicts for rank 232 -successfully loaded 8 ZeRO state_dicts for rank 177 -loading 8 zero partition checkpoints for rank 120 -successfully loaded 8 ZeRO state_dicts for rank 228 -successfully loaded 8 ZeRO state_dicts for rank 185 -successfully loaded 8 ZeRO state_dicts for rank 209 -successfully loaded 8 ZeRO state_dicts for rank 235 -successfully loaded 8 ZeRO state_dicts for rank 244 -successfully loaded 8 ZeRO state_dicts for rank 236 -successfully loaded 8 ZeRO state_dicts for rank 31 -loading 8 zero partition checkpoints for rank 116 -successfully loaded 8 ZeRO state_dicts for rank 224 -loading 8 zero partition checkpoints for rank 62 -successfully loaded 8 ZeRO state_dicts for rank 74 -loading 8 zero partition checkpoints for rank 166 -loading 8 zero partition checkpoints for rank 134 -successfully loaded 8 ZeRO state_dicts for rank 26 -successfully loaded 8 ZeRO state_dicts for rank 176 -loading 8 zero partition checkpoints for rank 204 -successfully loaded 8 ZeRO state_dicts for rank 251 -successfully loaded 8 ZeRO state_dicts for rank 15 -successfully loaded 8 ZeRO state_dicts for rank 4 -loading 8 zero partition checkpoints for rank 199 -loading 8 zero partition checkpoints for rank 133 -loading 8 zero partition checkpoints for rank 198 -loading 8 zero partition checkpoints for rank 67 -successfully loaded 8 ZeRO state_dicts for rank 18 -successfully loaded 8 ZeRO state_dicts for rank 179 -loading 8 zero partition checkpoints for rank 124 -successfully loaded 8 ZeRO state_dicts for rank 247 -successfully loaded 8 ZeRO state_dicts for rank 11 -successfully loaded 8 ZeRO state_dicts for rank 28 -loading 8 zero partition checkpoints for rank 148 -successfully loaded 8 ZeRO state_dicts for rank 229 -loading 8 zero partition checkpoints for rank 65 -successfully loaded 8 ZeRO state_dicts for rank 7 -successfully loaded 8 ZeRO state_dicts for rank 248 -successfully loaded 8 ZeRO state_dicts for rank 221 -loading 8 zero partition checkpoints for rank 182 -loading 8 zero partition checkpoints for rank 130 -successfully loaded 8 ZeRO state_dicts for rank 238 -successfully loaded 8 ZeRO state_dicts for rank 12 -loading 8 zero partition checkpoints for rank 145 -successfully loaded 8 ZeRO state_dicts for rank 234 -successfully loaded 8 ZeRO state_dicts for rank 6 -loading 8 zero partition checkpoints for rank 206 -successfully loaded 8 ZeRO state_dicts for rank 27 -successfully loaded 8 ZeRO state_dicts for rank 250 -loading 8 zero partition checkpoints for rank 157 -successfully loaded 8 ZeRO state_dicts for rank 225 -successfully loaded 8 ZeRO state_dicts for rank 23 -loading 8 zero partition checkpoints for rank 40 -successfully loaded 8 ZeRO state_dicts for rank 19 -successfully loaded 8 ZeRO state_dicts for rank 3 -loading 8 zero partition checkpoints for rank 89 -loading 8 zero partition checkpoints for rank 141 -loading 8 zero partition checkpoints for rank 122 -loading 8 zero partition checkpoints for rank 75 -successfully loaded 8 ZeRO state_dicts for rank 239 -successfully loaded 8 ZeRO state_dicts for rank 241 -successfully loaded 8 ZeRO state_dicts for rank 245 -successfully loaded 8 ZeRO state_dicts for rank 243 -successfully loaded 8 ZeRO state_dicts for rank 0 
-successfully loaded 8 ZeRO state_dicts for rank 20 -successfully loaded 8 ZeRO state_dicts for rank 24 -loading 8 zero partition checkpoints for rank 140 -successfully loaded 8 ZeRO state_dicts for rank 231 -successfully loaded 8 ZeRO state_dicts for rank 29 -loading 8 zero partition checkpoints for rank 32 -successfully loaded 8 ZeRO state_dicts for rank 240 -successfully loaded 8 ZeRO state_dicts for rank 2 -successfully loaded 8 ZeRO state_dicts for rank 16 -loading 8 zero partition checkpoints for rank 132 -successfully loaded 8 ZeRO state_dicts for rank 233 -successfully loaded 8 ZeRO state_dicts for rank 253 -successfully loaded 8 ZeRO state_dicts for rank 255 -successfully loaded 8 ZeRO state_dicts for rank 242 -successfully loaded 8 ZeRO state_dicts for rank 237 -loading 8 zero partition checkpoints for rank 83 -successfully loaded 8 ZeRO state_dicts for rank 254 -loading 8 zero partition checkpoints for rank 165 -loading 8 zero partition checkpoints for rank 158 -successfully loaded 8 ZeRO state_dicts for rank 246 -loading 8 zero partition checkpoints for rank 77 -loading 8 zero partition checkpoints for rank 99 -loading 8 zero partition checkpoints for rank 152 -loading 8 zero partition checkpoints for rank 216 -loading 8 zero partition checkpoints for rank 36 -loading 8 zero partition checkpoints for rank 115 -loading 8 zero partition checkpoints for rank 54 -loading 8 zero partition checkpoints for rank 190 -loading 8 zero partition checkpoints for rank 146 -loading 8 zero partition checkpoints for rank 98 -loading 8 zero partition checkpoints for rank 100 -loading 8 zero partition checkpoints for rank 150 -successfully loaded 8 ZeRO state_dicts for rank 13 -successfully loaded 8 ZeRO state_dicts for rank 226 -successfully loaded 8 ZeRO state_dicts for rank 9 -loading 8 zero partition checkpoints for rank 153 -loading 8 zero partition checkpoints for rank 64 -successfully loaded 8 ZeRO state_dicts for rank 5 -successfully loaded 8 ZeRO state_dicts for rank 249 -loading 8 zero partition checkpoints for rank 155 -loading 8 zero partition checkpoints for rank 72 -successfully loaded 8 ZeRO state_dicts for rank 17 -successfully loaded 8 ZeRO state_dicts for rank 230 -loading 8 zero partition checkpoints for rank 80 -loading 8 zero partition checkpoints for rank 149 -loading 8 zero partition checkpoints for rank 76 -successfully loaded 8 ZeRO state_dicts for rank 1 -successfully loaded 8 ZeRO state_dicts for rank 227 -loading 8 zero partition checkpoints for rank 144 -successfully loaded 8 ZeRO state_dicts for rank 21 -loading 8 zero partition checkpoints for rank 41 -loading 8 zero partition checkpoints for rank 107 -loading 8 zero partition checkpoints for rank 34 -loading 8 zero partition checkpoints for rank 87 -loading 8 zero partition checkpoints for rank 212 -loading 8 zero partition checkpoints for rank 220 -loading 8 zero partition checkpoints for rank 44 -loading 8 zero partition checkpoints for rank 73 -loading 8 zero partition checkpoints for rank 33 -loading 8 zero partition checkpoints for rank 164 -loading 8 zero partition checkpoints for rank 111 -loading 8 zero partition checkpoints for rank 106 -loading 8 zero partition checkpoints for rank 167 -loading 8 zero partition checkpoints for rank 39 -loading 8 zero partition checkpoints for rank 46 -loading 8 zero partition checkpoints for rank 201 -loading 8 zero partition checkpoints for rank 151 -loading 8 zero partition checkpoints for rank 118 -loading 8 zero partition checkpoints for rank 71 -loading 8 zero 
partition checkpoints for rank 59 -loading 8 zero partition checkpoints for rank 114 -loading 8 zero partition checkpoints for rank 159 -loading 8 zero partition checkpoints for rank 57 -loading 8 zero partition checkpoints for rank 43 -loading 8 zero partition checkpoints for rank 97 -loading 8 zero partition checkpoints for rank 219 -loading 8 zero partition checkpoints for rank 113 -loading 8 zero partition checkpoints for rank 55 -loading 8 zero partition checkpoints for rank 61 -loading 8 zero partition checkpoints for rank 203 -loading 8 zero partition checkpoints for rank 211 -loading 8 zero partition checkpoints for rank 50 -loading 8 zero partition checkpoints for rank 48 -loading 8 zero partition checkpoints for rank 200 -loading 8 zero partition checkpoints for rank 191 -loading 8 zero partition checkpoints for rank 169 -loading 8 zero partition checkpoints for rank 102 -loading 8 zero partition checkpoints for rank 81 -loading 8 zero partition checkpoints for rank 56 -loading 8 zero partition checkpoints for rank 147 -loading 8 zero partition checkpoints for rank 84 -loading 8 zero partition checkpoints for rank 136 -loading 8 zero partition checkpoints for rank 210 -loading 8 zero partition checkpoints for rank 178 -loading 8 zero partition checkpoints for rank 105 -loading 8 zero partition checkpoints for rank 223 -loading 8 zero partition checkpoints for rank 197 -loading 8 zero partition checkpoints for rank 170 -loading 8 zero partition checkpoints for rank 135 -loading 8 zero partition checkpoints for rank 45 -loading 8 zero partition checkpoints for rank 119 -loading 8 zero partition checkpoints for rank 180 -loading 8 zero partition checkpoints for rank 173 -loading 8 zero partition checkpoints for rank 123 -loading 8 zero partition checkpoints for rank 125 -loading 8 zero partition checkpoints for rank 171 -loading 8 zero partition checkpoints for rank 186 -loading 8 zero partition checkpoints for rank 109 -loading 8 zero partition checkpoints for rank 52 -loading 8 zero partition checkpoints for rank 121 -loading 8 zero partition checkpoints for rank 58 -loading 8 zero partition checkpoints for rank 53 -loading 8 zero partition checkpoints for rank 218 -loading 8 zero partition checkpoints for rank 168 -loading 8 zero partition checkpoints for rank 181 -loading 8 zero partition checkpoints for rank 188 -loading 8 zero partition checkpoints for rank 194 -loading 8 zero partition checkpoints for rank 92 -loading 8 zero partition checkpoints for rank 184 -successfully loaded 8 ZeRO state_dicts for rank 25 -loading 8 zero partition checkpoints for rank 156 -loading 8 zero partition checkpoints for rank 161 -loading 8 zero partition checkpoints for rank 131 -loading 8 zero partition checkpoints for rank 63 -loading 8 zero partition checkpoints for rank 35 -loading 8 zero partition checkpoints for rank 66 -loading 8 zero partition checkpoints for rank 90 -loading 8 zero partition checkpoints for rank 163 -loading 8 zero partition checkpoints for rank 93 -loading 8 zero partition checkpoints for rank 86 -loading 8 zero partition checkpoints for rank 183 -loading 8 zero partition checkpoints for rank 117 -loading 8 zero partition checkpoints for rank 103 -loading 8 zero partition checkpoints for rank 47 -loading 8 zero partition checkpoints for rank 10 -loading 8 zero partition checkpoints for rank 82 -loading 8 zero partition checkpoints for rank 69 -loading 8 zero partition checkpoints for rank 60 -loading 8 zero partition checkpoints for rank 101 -loading 8 zero partition 
checkpoints for rank 94 -loading 8 zero partition checkpoints for rank 22 -loading 8 zero partition checkpoints for rank 108 -loading 8 zero partition checkpoints for rank 177 -loading 8 zero partition checkpoints for rank 37 -loading 8 zero partition checkpoints for rank 38 -loading 8 zero partition checkpoints for rank 79 -loading 8 zero partition checkpoints for rank 217 -loading 8 zero partition checkpoints for rank 138 -loading 8 zero partition checkpoints for rank 189 -loading 8 zero partition checkpoints for rank 208 -loading 8 zero partition checkpoints for rank 143 -loading 8 zero partition checkpoints for rank 142 -loading 8 zero partition checkpoints for rank 172 -loading 8 zero partition checkpoints for rank 85 -loading 8 zero partition checkpoints for rank 74 -loading 8 zero partition checkpoints for rank 68 -loading 8 zero partition checkpoints for rank 14 -loading 8 zero partition checkpoints for rank 252 -loading 8 zero partition checkpoints for rank 202 -loading 8 zero partition checkpoints for rank 95 -loading 8 zero partition checkpoints for rank 126 -loading 8 zero partition checkpoints for rank 129 -loading 8 zero partition checkpoints for rank 232 -loading 8 zero partition checkpoints for rank 137 -loading 8 zero partition checkpoints for rank 214 -loading 8 zero partition checkpoints for rank 78 -loading 8 zero partition checkpoints for rank 162 -loading 8 zero partition checkpoints for rank 4 -loading 8 zero partition checkpoints for rank 127 -loading 8 zero partition checkpoints for rank 139 -loading 8 zero partition checkpoints for rank 110 -loading 8 zero partition checkpoints for rank 247 -loading 8 zero partition checkpoints for rank 222 -loading 8 zero partition checkpoints for rank 229 -loading 8 zero partition checkpoints for rank 128 -loading 8 zero partition checkpoints for rank 51 -loading 8 zero partition checkpoints for rank 174 -loading 8 zero partition checkpoints for rank 187 -loading 8 zero partition checkpoints for rank 70 -loading 8 zero partition checkpoints for rank 215 -loading 8 zero partition checkpoints for rank 160 -loading 8 zero partition checkpoints for rank 91 -loading 8 zero partition checkpoints for rank 49 -loading 8 zero partition checkpoints for rank 6 -loading 8 zero partition checkpoints for rank 24 -loading 8 zero partition checkpoints for rank 243 -loading 8 zero partition checkpoints for rank 221 -loading 8 zero partition checkpoints for rank 8 -loading 8 zero partition checkpoints for rank 20 -loading 8 zero partition checkpoints for rank 240 -loading 8 zero partition checkpoints for rank 236 -loading 8 zero partition checkpoints for rank 2 -loading 8 zero partition checkpoints for rank 27 -loading 8 zero partition checkpoints for rank 213 -loading 8 zero partition checkpoints for rank 176 -loading 8 zero partition checkpoints for rank 175 -loading 8 zero partition checkpoints for rank 253 -loading 8 zero partition checkpoints for rank 209 -loading 8 zero partition checkpoints for rank 231 -loading 8 zero partition checkpoints for rank 239 -loading 8 zero partition checkpoints for rank 88 -loading 8 zero partition checkpoints for rank 28 -loading 8 zero partition checkpoints for rank 179 -loading 8 zero partition checkpoints for rank 185 -loading 8 zero partition checkpoints for rank 13 -loading 8 zero partition checkpoints for rank 233 -loading 8 zero partition checkpoints for rank 11 -loading 8 zero partition checkpoints for rank 246 -loading 8 zero partition checkpoints for rank 9 -loading 8 zero partition checkpoints for 
rank 224 -loading 8 zero partition checkpoints for rank 248 -loading 8 zero partition checkpoints for rank 251 -loading 8 zero partition checkpoints for rank 1 -loading 8 zero partition checkpoints for rank 29 -loading 8 zero partition checkpoints for rank 235 -loading 8 zero partition checkpoints for rank 250 -loading 8 zero partition checkpoints for rank 23 -loading 8 zero partition checkpoints for rank 244 -loading 8 zero partition checkpoints for rank 241 -loading 8 zero partition checkpoints for rank 225 -loading 8 zero partition checkpoints for rank 18 -loading 8 zero partition checkpoints for rank 234 -loading 8 zero partition checkpoints for rank 3 -loading 8 zero partition checkpoints for rank 242 -loading 8 zero partition checkpoints for rank 0 - checkpoint version 3.0 -loading 8 zero partition checkpoints for rank 21 -loading 8 zero partition checkpoints for rank 249 -loading 8 zero partition checkpoints for rank 245 -loading 8 zero partition checkpoints for rank 228 -loading 8 zero partition checkpoints for rank 26 -loading 8 zero partition checkpoints for rank 30 -loading 8 zero partition checkpoints for rank 19 -loading 8 zero partition checkpoints for rank 15 -loading 8 zero partition checkpoints for rank 7 -loading 8 zero partition checkpoints for rank 238 -loading 8 zero partition checkpoints for rank 17 -loading 8 zero partition checkpoints for rank 31 -loading 8 zero partition checkpoints for rank 255 -loading 8 zero partition checkpoints for rank 12 -loading 8 zero partition checkpoints for rank 237 -loading 8 zero partition checkpoints for rank 16 -loading 8 zero partition checkpoints for rank 254 -loading 8 zero partition checkpoints for rank 230 -loading 8 zero partition checkpoints for rank 5 -loading 8 zero partition checkpoints for rank 25 -loading 8 zero partition checkpoints for rank 226 -loading 8 zero partition checkpoints for rank 227 - successfully loaded checkpoint from /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints at iteration 6210 -time (ms) | load-checkpoint: 56578.08 -[after model, optimizer, and learning rate scheduler are built] datetime: 2021-09-27 17:45:07 -> building train, validation, and test datasets ... - > datasets target sizes (minimum size): - train: 300000000 - validation: 1638400 - test: 10240 -> building train, validation, and test datasets for GPT ... - > building dataset index ... - reading sizes... - reading pointers... - reading document index... - creating numpy buffer of mmap... - creating memory view of numpy buffer... - > finished creating indexed dataset in 0.174718 seconds - number of documents: 304230423 - > dataset split: - train: - document indices in [0, 288714672) total of 288714672 documents - validation: - document indices in [288714672, 303926193) total of 15211521 documents - test: - document indices in [303926193, 304230423) total of 304230 documents - > WARNING: could not find index map files, building the indices on rank 0 ... 
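For reference, the train/validation/test document boundaries reported above are consistent with a 949/50/1 split of the 304,230,423 documents. The sketch below reproduces the split arithmetic under that assumption (the split string "949,50,1" is inferred from the logged boundaries, not read from the run's actual configuration): each segment is rounded independently, then every boundary is shifted so the last one lands exactly on the document count.

```python
# Sketch of Megatron-style train/valid/test split arithmetic.
# Assumption: the run used a split of "949,50,1"; this value is inferred
# from the logged boundaries, not taken from the actual training config.

def split_indices(splits_string, size):
    """Turn a weight string like '949,50,1' into document index boundaries."""
    weights = [float(w) for w in splits_string.split(",")]
    total = sum(weights)
    fractions = [w / total for w in weights]

    # Cumulative boundaries, rounding each segment independently...
    bounds = [0]
    for frac in fractions:
        bounds.append(bounds[-1] + int(round(frac * float(size))))

    # ...then shift all boundaries so the last one lands exactly on `size`.
    diff = bounds[-1] - size
    for i in range(1, len(bounds)):
        bounds[i] -= diff
    return bounds

print(split_indices("949,50,1", 304230423))
# [0, 288714672, 303926193, 304230423] -- matches the logged
# train/validation/test document index ranges above.
```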
- > last epoch number of samples (36925554) is smaller than 80% of number of samples per epoch (131537223), setting separate_last_epoch to True -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-27 17:46:37 CEST)" was missed by 0:00:21.460713 - > elapsed time to build and save doc-idx mapping (seconds): 74.353737 - using: - number of documents: 288714672 - number of epochs: 3 - sequence length: 2048 - total number of samples: 394611669 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-27 17:47:37 CEST)" was missed by 0:00:11.662010 - > elapsed time to build and save sample-idx mapping (seconds): 24.775998 - > building shuffle index with split [0, 263074446) and [263074446, 394611669) ... - > elapsed time to build and save shuffle-idx mapping (seconds): 26.026031 - > loading doc-idx mapping from /gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document_train_indexmap_300000000ns_2048sl_43s_doc_idx.npy - > loading sample-idx mapping from /gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document_train_indexmap_300000000ns_2048sl_43s_sample_idx.npy - > loading shuffle-idx mapping from /gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document_train_indexmap_300000000ns_2048sl_43s_shuffle_idx.npy - loaded indexed file in 0.089 seconds - total number of samples: 394611670 - total number of epochs: 3 - > WARNING: could not find index map files, building the indices on rank 0 ... - > only one epoch required, setting separate_last_epoch to False - > elapsed time to build and save doc-idx mapping (seconds): 0.979826 - using: - number of documents: 15211521 - number of epochs: 1 - sequence length: 2048 - total number of samples: 6927160 - > elapsed time to build and save sample-idx mapping (seconds): 0.364344 - > building shuffle index with split [0, 6927160) and [6927160, 6927160) ... - > elapsed time to build and save shuffle-idx mapping (seconds): 0.312714 - > loading doc-idx mapping from /gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document_valid_indexmap_1638400ns_2048sl_43s_doc_idx.npy - > loading sample-idx mapping from /gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document_valid_indexmap_1638400ns_2048sl_43s_sample_idx.npy - > loading shuffle-idx mapping from /gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document_valid_indexmap_1638400ns_2048sl_43s_shuffle_idx.npy - loaded indexed file in 0.034 seconds - total number of samples: 6927161 - total number of epochs: 1 - > WARNING: could not find index map files, building the indices on rank 0 ... - > only one epoch required, setting separate_last_epoch to False - > elapsed time to build and save doc-idx mapping (seconds): 0.019056 - using: - number of documents: 304230 - number of epochs: 1 - sequence length: 2048 - total number of samples: 137383 - > elapsed time to build and save sample-idx mapping (seconds): 0.007505 - > building shuffle index with split [0, 137383) and [137383, 137383) ... 
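The separate_last_epoch decision and the two-piece shuffle split logged above follow from simple arithmetic on the logged values: 300,000,000 train samples are requested, one epoch of the training documents yields 131,537,223 samples at sequence length 2048, so three epochs are needed; the partial third epoch must supply only 36,925,554 samples, below 80% of a full epoch, so it is shuffled separately to keep samples drawn from it inside that epoch. A minimal sketch, using only numbers taken from the log:

```python
import math

# Numbers taken from the log above.
target_samples = 300_000_000     # requested train samples
samples_per_epoch = 131_537_223  # samples one epoch yields at seq len 2048

# Epochs needed to cover the request.
num_epochs = math.ceil(target_samples / samples_per_epoch)
assert num_epochs == 3

# Samples that must come from the final (partial) epoch.
last_epoch_samples = target_samples - (num_epochs - 1) * samples_per_epoch
assert last_epoch_samples == 36_925_554

# If the partial epoch supplies < 80% of a full epoch, it is shuffled
# separately (separate_last_epoch=True in the log above).
assert last_epoch_samples < 0.8 * samples_per_epoch

# The two shuffle-index ranges reported in the log.
full_epochs_end = (num_epochs - 1) * samples_per_epoch  # 263_074_446
total_samples = num_epochs * samples_per_epoch          # 394_611_669
print(f"shuffle split [0, {full_epochs_end}) and "
      f"[{full_epochs_end}, {total_samples})")
```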
- > elapsed time to build and save shuffle-idx mapping (seconds): 0.021865 - > loading doc-idx mapping from /gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document_test_indexmap_10240ns_2048sl_43s_doc_idx.npy - > loading sample-idx mapping from /gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document_test_indexmap_10240ns_2048sl_43s_sample_idx.npy - > loading shuffle-idx mapping from /gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document_test_indexmap_10240ns_2048sl_43s_shuffle_idx.npy - loaded indexed file in 0.110 seconds - total number of samples: 137384 - total number of epochs: 1 -> finished creating GPT datasets ... -[after dataloaders are built] datetime: 2021-09-27 17:47:20 -done with setup ... -training ... -time (ms) | model-and-optimizer-setup: 64587.82 | train/valid/test-data-iterators-setup: 131511.20 -[before the start of training step] datetime: 2021-09-27 17:47:20 -[2021-09-27 17:47:20,277] [INFO] [checkpointing.py:408:forward] Activation Checkpointing Information -[2021-09-27 17:47:20,277] [INFO] [checkpointing.py:409:forward] ----Partition Activations False, CPU CHECKPOINTING False -[2021-09-27 17:47:20,277] [INFO] [checkpointing.py:412:forward] ----contiguous Memory Checkpointing False with 32 total layers -[2021-09-27 17:47:20,277] [INFO] [checkpointing.py:415:forward] ----Synchronization False -[2021-09-27 17:47:20,277] [INFO] [checkpointing.py:416:forward] ----Profiling time in checkpointing False -[Rank 225] (after 6220 iterations) memory (MB) | allocated: 7107.7119140625 | max allocated: 11885.68798828125 | reserved: 22492.0 | max reserved: 22492.0 -[Rank 226] (after 6220 iterations) memory (MB) | allocated: 7107.7119140625 | max allocated: 11885.6884765625 | reserved: 22492.0 | max reserved: 22492.0 -[Rank 1] (after 6220 iterations) memory (MB) | allocated: 6689.83056640625 | max allocated: 13899.01416015625 | reserved: 23278.0 | max reserved: 23278.0 -[Rank 2] (after 6220 iterations) memory (MB) | allocated: 6689.83056640625 | max allocated: 13899.01416015625 | reserved: 23278.0 | max reserved: 23278.0 -[Rank 0] (after 6220 iterations) memory (MB) | allocated: 6689.83056640625 | max allocated: 13899.01416015625 | reserved: 23246.0 | max reserved: 23246.0 -[Rank 224] (after 6220 iterations) memory (MB) | allocated: 7107.7119140625 | max allocated: 11885.68994140625 | reserved: 22492.0 | max reserved: 22492.0 -[Rank 227] (after 6220 iterations) memory (MB) | allocated: 7107.7119140625 | max allocated: 11885.6884765625 | reserved: 21700.0 | max reserved: 21700.0 - iteration 6220/ 159576 | consumed samples: 194400 | elapsed time per iteration (ms): 19180.4 | learning rate: 5.378E-05 | global batch size: 80 | lm loss: 6.355129E+00 | loss scale: 4096.0 | grad norm: 93535.397 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -[Rank 3] (after 6220 iterations) memory (MB) | allocated: 6689.83056640625 | max allocated: 13899.01416015625 | reserved: 23278.0 | max reserved: 23278.0 -[Rank 33] (after 6220 iterations) memory (MB) | allocated: 5861.55029296875 | max allocated: 12082.4677734375 | reserved: 20130.0 | max reserved: 20130.0 -[Rank 66] (after 6220 iterations) memory (MB) | allocated: 5861.55029296875 | max allocated: 11810.46728515625 | reserved: 19950.0 | max reserved: 19950.0 -[Rank 34] (after 6220 iterations) memory (MB) | allocated: 5861.55029296875 | max allocated: 12082.4677734375 | reserved: 20250.0 | max reserved: 20250.0 -[Rank 98] (after 6220 iterations) memory 
(MB) | allocated: 5861.55029296875 | max allocated: 11538.466796875 | reserved: 19886.0 | max reserved: 19886.0 -[Rank 130] (after 6220 iterations) memory (MB) | allocated: 5861.55029296875 | max allocated: 11266.46630859375 | reserved: 19338.0 | max reserved: 19338.0 -[Rank 97] (after 6220 iterations) memory (MB) | allocated: 5861.55029296875 | max allocated: 11538.466796875 | reserved: 19402.0 | max reserved: 19402.0 -[Rank 161] (after 6220 iterations) memory (MB) | allocated: 5861.55029296875 | max allocated: 10994.4658203125 | reserved: 20170.0 | max reserved: 20170.0 -[Rank 129] (after 6220 iterations) memory (MB) | allocated: 5861.55029296875 | max allocated: 11266.46630859375 | reserved: 19050.0 | max reserved: 19050.0 -[Rank 193] (after 6220 iterations) memory (MB) | allocated: 5861.55029296875 | max allocated: 10722.46533203125 | reserved: 18826.0 | max reserved: 18826.0 -[Rank 65] (after 6220 iterations) memory (MB) | allocated: 5861.55029296875 | max allocated: 11810.46728515625 | reserved: 19582.0 | max reserved: 19582.0 -[Rank 194] (after 6220 iterations) memory (MB) | allocated: 5861.55029296875 | max allocated: 10722.46533203125 | reserved: 18970.0 | max reserved: 18970.0 -[Rank 162] (after 6220 iterations) memory (MB) | allocated: 5861.55029296875 | max allocated: 10994.4658203125 | reserved: 19146.0 | max reserved: 19146.0 -[Rank 32] (after 6220 iterations) memory (MB) | allocated: 5861.55029296875 | max allocated: 12082.4677734375 | reserved: 20676.0 | max reserved: 20676.0 -[Rank 96] (after 6220 iterations) memory (MB) | allocated: 5861.55029296875 | max allocated: 11538.466796875 | reserved: 20296.0 | max reserved: 20296.0 -[Rank 64] (after 6220 iterations) memory (MB) | allocated: 5861.55029296875 | max allocated: 11810.46728515625 | reserved: 20392.0 | max reserved: 20392.0 -[Rank 35] (after 6220 iterations) memory (MB) | allocated: 5861.55029296875 | max allocated: 12082.4677734375 | reserved: 20030.0 | max reserved: 20030.0 -[Rank 160] (after 6220 iterations) memory (MB) | allocated: 5861.55029296875 | max allocated: 10994.4658203125 | reserved: 19636.0 | max reserved: 19636.0 -[Rank 192] (after 6220 iterations) memory (MB) | allocated: 5861.55029296875 | max allocated: 10722.46533203125 | reserved: 19012.0 | max reserved: 19012.0 -[Rank 128] (after 6220 iterations) memory (MB) | allocated: 5861.55029296875 | max allocated: 11266.46630859375 | reserved: 20008.0 | max reserved: 20008.0 -[Rank 99] (after 6220 iterations) memory (MB) | allocated: 5861.55029296875 | max allocated: 11538.466796875 | reserved: 19870.0 | max reserved: 19870.0 -[Rank 67] (after 6220 iterations) memory (MB) | allocated: 5861.55029296875 | max allocated: 11810.46728515625 | reserved: 19582.0 | max reserved: 19582.0 -[Rank 131] (after 6220 iterations) memory (MB) | allocated: 5861.55029296875 | max allocated: 11266.46630859375 | reserved: 19278.0 | max reserved: 19278.0 -[Rank 195] (after 6220 iterations) memory (MB) | allocated: 5861.55029296875 | max allocated: 10722.46533203125 | reserved: 18970.0 | max reserved: 18970.0 -[Rank 163] (after 6220 iterations) memory (MB) | allocated: 5861.55029296875 | max allocated: 10994.4658203125 | reserved: 18826.0 | max reserved: 18826.0 - iteration 6230/ 159576 | consumed samples: 195200 | elapsed time per iteration (ms): 17628.9 | learning rate: 5.400E-05 | global batch size: 80 | lm loss: 6.325471E+00 | loss scale: 4096.0 | grad norm: 104626.566 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 
6240/ 159576 | consumed samples: 196000 | elapsed time per iteration (ms): 17585.3 | learning rate: 5.423E-05 | global batch size: 80 | lm loss: 6.313773E+00 | loss scale: 4096.0 | grad norm: 104488.785 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 6250/ 159576 | consumed samples: 196800 | elapsed time per iteration (ms): 17683.9 | learning rate: 5.445E-05 | global batch size: 80 | lm loss: 6.302388E+00 | loss scale: 4096.0 | grad norm: 99404.120 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 6260/ 159576 | consumed samples: 197600 | elapsed time per iteration (ms): 17834.3 | learning rate: 5.467E-05 | global batch size: 80 | lm loss: 6.322264E+00 | loss scale: 4096.0 | grad norm: 134601.608 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 6270/ 159576 | consumed samples: 198400 | elapsed time per iteration (ms): 17647.5 | learning rate: 5.489E-05 | global batch size: 80 | lm loss: 6.319476E+00 | loss scale: 4096.0 | grad norm: 142879.794 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 6280/ 159576 | consumed samples: 199200 | elapsed time per iteration (ms): 17607.4 | learning rate: 5.511E-05 | global batch size: 80 | lm loss: 6.321982E+00 | loss scale: 4096.0 | grad norm: 114136.314 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 6290/ 159576 | consumed samples: 200000 | elapsed time per iteration (ms): 17636.6 | learning rate: 5.534E-05 | global batch size: 80 | lm loss: 6.272703E+00 | loss scale: 4096.0 | grad norm: 101011.949 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 6300/ 159576 | consumed samples: 200800 | elapsed time per iteration (ms): 17537.9 | learning rate: 5.556E-05 | global batch size: 80 | lm loss: 6.295881E+00 | loss scale: 4096.0 | grad norm: 116874.031 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 6310/ 159576 | consumed samples: 201600 | elapsed time per iteration (ms): 17634.4 | learning rate: 5.578E-05 | global batch size: 80 | lm loss: 6.324175E+00 | loss scale: 4096.0 | grad norm: 115938.037 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 6320/ 159576 | consumed samples: 202400 | elapsed time per iteration (ms): 17796.6 | learning rate: 5.600E-05 | global batch size: 80 | lm loss: 6.301260E+00 | loss scale: 4096.0 | grad norm: 128639.863 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 6330/ 159576 | consumed samples: 203200 | elapsed time per iteration (ms): 17684.4 | learning rate: 5.622E-05 | global batch size: 80 | lm loss: 6.325212E+00 | loss scale: 4096.0 | grad norm: 122331.136 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 6340/ 159576 | consumed samples: 204000 | elapsed time per iteration (ms): 17751.1 | learning rate: 5.645E-05 | global batch size: 80 | lm loss: 6.315152E+00 | loss scale: 4096.0 | grad norm: 107257.166 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -[2021-09-27 18:28:25] PULSE: tr8-104B is running for 44:59 since 2021-09-27T17:43:26 (1271196 on 'gpu_p13' partition 
(r7i7n[6-8],r8i0n[0-8],r8i1n[0-4],r8i7n[3-8],r9i0n[0-6,8],r9i1n[0-8],r9i2n0,r9i4n8,r9i5n[0-8],r9i6n[0-8],r9i7n[3-6]) - iteration 6350/ 159576 | consumed samples: 204800 | elapsed time per iteration (ms): 17472.1 | learning rate: 5.667E-05 | global batch size: 80 | lm loss: 6.305837E+00 | loss scale: 4096.0 | grad norm: 92922.842 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 6360/ 159576 | consumed samples: 205600 | elapsed time per iteration (ms): 17585.4 | learning rate: 5.689E-05 | global batch size: 80 | lm loss: 6.291708E+00 | loss scale: 4096.0 | grad norm: 128015.015 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 6370/ 159576 | consumed samples: 206400 | elapsed time per iteration (ms): 17756.4 | learning rate: 5.711E-05 | global batch size: 80 | lm loss: 6.336868E+00 | loss scale: 4096.0 | grad norm: 132675.737 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 6380/ 159576 | consumed samples: 207200 | elapsed time per iteration (ms): 17470.3 | learning rate: 5.733E-05 | global batch size: 80 | lm loss: 6.319473E+00 | loss scale: 4096.0 | grad norm: 121903.409 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 6390/ 159576 | consumed samples: 208000 | elapsed time per iteration (ms): 17849.6 | learning rate: 5.755E-05 | global batch size: 80 | lm loss: 6.295473E+00 | loss scale: 4096.0 | grad norm: 108842.830 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 6400/ 159576 | consumed samples: 208800 | elapsed time per iteration (ms): 17525.6 | learning rate: 5.778E-05 | global batch size: 80 | lm loss: 6.305953E+00 | loss scale: 4096.0 | grad norm: 110142.091 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 6410/ 159576 | consumed samples: 209600 | elapsed time per iteration (ms): 17695.6 | learning rate: 5.800E-05 | global batch size: 80 | lm loss: 6.327058E+00 | loss scale: 4096.0 | grad norm: 149204.626 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 6420/ 159576 | consumed samples: 210400 | elapsed time per iteration (ms): 17590.8 | learning rate: 5.822E-05 | global batch size: 80 | lm loss: 6.301820E+00 | loss scale: 4096.0 | grad norm: 90947.052 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 6430/ 159576 | consumed samples: 211200 | elapsed time per iteration (ms): 17793.7 | learning rate: 5.844E-05 | global batch size: 80 | lm loss: 6.343626E+00 | loss scale: 4096.0 | grad norm: 345234.052 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 6440/ 159576 | consumed samples: 212000 | elapsed time per iteration (ms): 17631.2 | learning rate: 5.866E-05 | global batch size: 80 | lm loss: 6.323440E+00 | loss scale: 4096.0 | grad norm: 96087.714 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 6450/ 159576 | consumed samples: 212800 | elapsed time per iteration (ms): 17688.1 | learning rate: 5.889E-05 | global batch size: 80 | lm loss: 6.310754E+00 | loss scale: 4096.0 | grad norm: 142702.659 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 6460/ 159576 | consumed 
samples: 213600 | elapsed time per iteration (ms): 17884.9 | learning rate: 5.911E-05 | global batch size: 80 | lm loss: 6.326996E+00 | loss scale: 4096.0 | grad norm: 139353.302 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 6470/ 159576 | consumed samples: 214400 | elapsed time per iteration (ms): 17777.5 | learning rate: 5.933E-05 | global batch size: 80 | lm loss: 6.303541E+00 | loss scale: 4096.0 | grad norm: 163735.847 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 6480/ 159576 | consumed samples: 215200 | elapsed time per iteration (ms): 17758.4 | learning rate: 5.955E-05 | global batch size: 80 | lm loss: 6.318764E+00 | loss scale: 4096.0 | grad norm: 122570.514 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 6490/ 159576 | consumed samples: 216000 | elapsed time per iteration (ms): 17864.1 | learning rate: 5.977E-05 | global batch size: 80 | lm loss: 6.307048E+00 | loss scale: 4096.0 | grad norm: 116946.724 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 6500/ 159576 | consumed samples: 216800 | elapsed time per iteration (ms): 17901.7 | learning rate: 6.000E-05 | global batch size: 80 | lm loss: 6.315722E+00 | loss scale: 4096.0 | grad norm: 93922.032 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 6510/ 159576 | consumed samples: 217600 | elapsed time per iteration (ms): 17582.8 | learning rate: 6.000E-05 | global batch size: 80 | lm loss: 6.323491E+00 | loss scale: 4096.0 | grad norm: 148357.794 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 6520/ 159576 | consumed samples: 218400 | elapsed time per iteration (ms): 17725.2 | learning rate: 6.000E-05 | global batch size: 80 | lm loss: 6.330975E+00 | loss scale: 4096.0 | grad norm: 103909.494 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 6530/ 159576 | consumed samples: 219200 | elapsed time per iteration (ms): 17788.4 | learning rate: 6.000E-05 | global batch size: 80 | lm loss: 6.330465E+00 | loss scale: 4096.0 | grad norm: 112690.620 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 6540/ 159576 | consumed samples: 220000 | elapsed time per iteration (ms): 17722.2 | learning rate: 6.000E-05 | global batch size: 80 | lm loss: 6.325342E+00 | loss scale: 4096.0 | grad norm: 74738.856 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 6550/ 159576 | consumed samples: 220800 | elapsed time per iteration (ms): 17778.1 | learning rate: 6.000E-05 | global batch size: 80 | lm loss: 6.338161E+00 | loss scale: 4096.0 | grad norm: 92386.024 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -[2021-09-27 19:28:18] PULSE: tr8-104B is running for 1:44:52 since 2021-09-27T17:43:26 (1271196 on 'gpu_p13' partition (r7i7n[6-8],r8i0n[0-8],r8i1n[0-4],r8i7n[3-8],r9i0n[0-6,8],r9i1n[0-8],r9i2n0,r9i4n8,r9i5n[0-8],r9i6n[0-8],r9i7n[3-6]) - iteration 6560/ 159576 | consumed samples: 221600 | elapsed time per iteration (ms): 17633.8 | learning rate: 6.000E-05 | global batch size: 80 | lm loss: 6.346842E+00 | loss scale: 4096.0 | grad norm: 91412.181 | num zeros: 0.0 | number of skipped iterations: 0 | 
number of nan iterations: 0 | -time (ms) - iteration 6570/ 159576 | consumed samples: 222400 | elapsed time per iteration (ms): 17585.9 | learning rate: 6.000E-05 | global batch size: 80 | lm loss: 6.354182E+00 | loss scale: 4096.0 | grad norm: 106016.821 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 6580/ 159576 | consumed samples: 223200 | elapsed time per iteration (ms): 17723.8 | learning rate: 6.000E-05 | global batch size: 80 | lm loss: 6.339022E+00 | loss scale: 4096.0 | grad norm: 99292.123 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 6590/ 159576 | consumed samples: 224000 | elapsed time per iteration (ms): 17636.7 | learning rate: 6.000E-05 | global batch size: 80 | lm loss: 6.343359E+00 | loss scale: 4096.0 | grad norm: 142334.413 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 6600/ 159576 | consumed samples: 224800 | elapsed time per iteration (ms): 17663.9 | learning rate: 6.000E-05 | global batch size: 80 | lm loss: 6.340461E+00 | loss scale: 4096.0 | grad norm: 152141.320 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 6610/ 159576 | consumed samples: 225600 | elapsed time per iteration (ms): 17548.3 | learning rate: 6.000E-05 | global batch size: 80 | lm loss: 6.323914E+00 | loss scale: 4096.0 | grad norm: 170495.198 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 6620/ 159576 | consumed samples: 226400 | elapsed time per iteration (ms): 17566.2 | learning rate: 6.000E-05 | global batch size: 80 | lm loss: 6.304215E+00 | loss scale: 4096.0 | grad norm: 160242.764 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 6630/ 159576 | consumed samples: 227200 | elapsed time per iteration (ms): 17951.1 | learning rate: 6.000E-05 | global batch size: 80 | lm loss: 6.312865E+00 | loss scale: 4096.0 | grad norm: 104923.640 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 6640/ 159576 | consumed samples: 228000 | elapsed time per iteration (ms): 17693.7 | learning rate: 6.000E-05 | global batch size: 80 | lm loss: 6.337115E+00 | loss scale: 4096.0 | grad norm: 162544.865 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 6650/ 159576 | consumed samples: 228800 | elapsed time per iteration (ms): 17707.3 | learning rate: 6.000E-05 | global batch size: 80 | lm loss: 6.327879E+00 | loss scale: 4096.0 | grad norm: 80497.049 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 6660/ 159576 | consumed samples: 229600 | elapsed time per iteration (ms): 17584.5 | learning rate: 6.000E-05 | global batch size: 80 | lm loss: 6.404206E+00 | loss scale: 4096.0 | grad norm: 136886.090 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 6670/ 159576 | consumed samples: 230400 | elapsed time per iteration (ms): 17615.2 | learning rate: 6.000E-05 | global batch size: 80 | lm loss: 6.359778E+00 | loss scale: 4096.0 | grad norm: 123501.796 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 6680/ 159576 | consumed samples: 231200 | elapsed time per iteration (ms): 17812.0 | learning rate: 
6.000E-05 | global batch size: 80 | lm loss: 6.318851E+00 | loss scale: 4096.0 | grad norm: 118146.851 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 6690/ 159576 | consumed samples: 232000 | elapsed time per iteration (ms): 17690.8 | learning rate: 6.000E-05 | global batch size: 80 | lm loss: 6.324978E+00 | loss scale: 4096.0 | grad norm: 127513.155 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 6700/ 159576 | consumed samples: 232800 | elapsed time per iteration (ms): 17679.3 | learning rate: 6.000E-05 | global batch size: 80 | lm loss: 6.312429E+00 | loss scale: 4096.0 | grad norm: 141251.517 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 6710/ 159576 | consumed samples: 233600 | elapsed time per iteration (ms): 17730.1 | learning rate: 6.000E-05 | global batch size: 80 | lm loss: 6.304575E+00 | loss scale: 8192.0 | grad norm: 354806.488 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 6720/ 159576 | consumed samples: 234400 | elapsed time per iteration (ms): 17817.5 | learning rate: 6.000E-05 | global batch size: 80 | lm loss: 6.343853E+00 | loss scale: 8192.0 | grad norm: 400003.537 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 6730/ 159576 | consumed samples: 235200 | elapsed time per iteration (ms): 17886.0 | learning rate: 6.000E-05 | global batch size: 80 | lm loss: 6.329220E+00 | loss scale: 8192.0 | grad norm: 354798.775 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 6740/ 159576 | consumed samples: 236000 | elapsed time per iteration (ms): 17869.3 | learning rate: 6.000E-05 | global batch size: 80 | lm loss: 6.341031E+00 | loss scale: 8192.0 | grad norm: 452433.886 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 6750/ 159576 | consumed samples: 236912 | elapsed time per iteration (ms): 18328.8 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.325079E+00 | loss scale: 8192.0 | grad norm: 272354.067 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 6760/ 159576 | consumed samples: 237872 | elapsed time per iteration (ms): 17158.6 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.350076E+00 | loss scale: 4096.0 | grad norm: 109464.543 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -[2021-09-27 20:32:07] PULSE: tr8-104B is running for 2:48:41 since 2021-09-27T17:43:26 (1271196 on 'gpu_p13' partition (r7i7n[6-8],r8i0n[0-8],r8i1n[0-4],r8i7n[3-8],r9i0n[0-6,8],r9i1n[0-8],r9i2n0,r9i4n8,r9i5n[0-8],r9i6n[0-8],r9i7n[3-6]) - iteration 6770/ 159576 | consumed samples: 238832 | elapsed time per iteration (ms): 18779.7 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.347258E+00 | loss scale: 4096.0 | grad norm: 151362.578 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 6780/ 159576 | consumed samples: 239792 | elapsed time per iteration (ms): 18764.2 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.483617E+00 | loss scale: 4096.0 | grad norm: 144409.728 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 6790/ 159576 | 
consumed samples: 240752 | elapsed time per iteration (ms): 18830.0 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.459402E+00 | loss scale: 4096.0 | grad norm: 106762.239 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 6800/ 159576 | consumed samples: 241712 | elapsed time per iteration (ms): 18594.7 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.457979E+00 | loss scale: 4096.0 | grad norm: 159826.924 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 6810/ 159576 | consumed samples: 242672 | elapsed time per iteration (ms): 18590.0 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.445743E+00 | loss scale: 4096.0 | grad norm: 104586.355 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 6820/ 159576 | consumed samples: 243632 | elapsed time per iteration (ms): 18726.4 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.371418E+00 | loss scale: 4096.0 | grad norm: 181059.362 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 6830/ 159576 | consumed samples: 244592 | elapsed time per iteration (ms): 18734.3 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.385859E+00 | loss scale: 4096.0 | grad norm: 126958.593 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 6840/ 159576 | consumed samples: 245552 | elapsed time per iteration (ms): 18634.7 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.351850E+00 | loss scale: 4096.0 | grad norm: 154126.591 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 6850/ 159576 | consumed samples: 246512 | elapsed time per iteration (ms): 18587.1 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.341198E+00 | loss scale: 4096.0 | grad norm: 133262.949 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 6860/ 159576 | consumed samples: 247472 | elapsed time per iteration (ms): 19013.7 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.317137E+00 | loss scale: 4096.0 | grad norm: 101860.571 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 6870/ 159576 | consumed samples: 248432 | elapsed time per iteration (ms): 18789.2 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.332655E+00 | loss scale: 4096.0 | grad norm: 467416.787 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 6880/ 159576 | consumed samples: 249392 | elapsed time per iteration (ms): 18654.5 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.385090E+00 | loss scale: 4096.0 | grad norm: 154062.615 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 6890/ 159576 | consumed samples: 250352 | elapsed time per iteration (ms): 18644.4 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.355402E+00 | loss scale: 4096.0 | grad norm: 154349.296 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 6900/ 159576 | consumed samples: 251312 | elapsed time per iteration (ms): 18495.6 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.365808E+00 | loss 
scale: 4096.0 | grad norm: 95313.572 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 6910/ 159576 | consumed samples: 252272 | elapsed time per iteration (ms): 18802.1 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.598378E+00 | loss scale: 4096.0 | grad norm: 84678.880 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 6920/ 159576 | consumed samples: 253232 | elapsed time per iteration (ms): 18641.0 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 7.314456E+00 | loss scale: 4096.0 | grad norm: 122716.232 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 6930/ 159576 | consumed samples: 254192 | elapsed time per iteration (ms): 18564.1 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 9.121927E+00 | loss scale: 4096.0 | grad norm: 283384.130 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 6940/ 159576 | consumed samples: 255152 | elapsed time per iteration (ms): 18549.7 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 1.023865E+01 | loss scale: 4096.0 | grad norm: 42359.376 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 6950/ 159576 | consumed samples: 256112 | elapsed time per iteration (ms): 17675.8 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 9.249577E+00 | loss scale: 2048.0 | grad norm: 78368.205 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 6960/ 159576 | consumed samples: 257072 | elapsed time per iteration (ms): 18443.5 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 8.389180E+00 | loss scale: 2048.0 | grad norm: 40490.259 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 6970/ 159576 | consumed samples: 258032 | elapsed time per iteration (ms): 18545.1 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 7.529938E+00 | loss scale: 2048.0 | grad norm: 14218.251 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -[2021-09-27 21:35:01] PULSE: tr8-104B is running for 3:51:35 since 2021-09-27T17:43:26 (1271196 on 'gpu_p13' partition (r7i7n[6-8],r8i0n[0-8],r8i1n[0-4],r8i7n[3-8],r9i0n[0-6,8],r9i1n[0-8],r9i2n0,r9i4n8,r9i5n[0-8],r9i6n[0-8],r9i7n[3-6]) - iteration 6980/ 159576 | consumed samples: 258992 | elapsed time per iteration (ms): 18379.4 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 7.102215E+00 | loss scale: 2048.0 | grad norm: 18580.148 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 6990/ 159576 | consumed samples: 259952 | elapsed time per iteration (ms): 18355.5 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 7.018941E+00 | loss scale: 2048.0 | grad norm: 17882.180 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7000/ 159576 | consumed samples: 260912 | elapsed time per iteration (ms): 18505.9 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.942125E+00 | loss scale: 2048.0 | grad norm: 26860.562 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) ------------------------------------------------------------------------------------------------- - 
validation loss at iteration 7000 | lm loss value: 6.872679E+00 | lm loss PPL: 9.655315E+02 | ------------------------------------------------------------------------------------------------- - iteration 7010/ 159576 | consumed samples: 261872 | elapsed time per iteration (ms): 19970.7 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.816376E+00 | loss scale: 2048.0 | grad norm: 40294.075 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7020/ 159576 | consumed samples: 262832 | elapsed time per iteration (ms): 18648.1 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.821559E+00 | loss scale: 2048.0 | grad norm: 25012.263 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7030/ 159576 | consumed samples: 263792 | elapsed time per iteration (ms): 18478.0 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.893867E+00 | loss scale: 2048.0 | grad norm: 39565.380 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7040/ 159576 | consumed samples: 264752 | elapsed time per iteration (ms): 18670.1 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.871474E+00 | loss scale: 2048.0 | grad norm: 22832.888 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7050/ 159576 | consumed samples: 265712 | elapsed time per iteration (ms): 18521.8 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.875928E+00 | loss scale: 2048.0 | grad norm: 26237.022 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7060/ 159576 | consumed samples: 266672 | elapsed time per iteration (ms): 18543.5 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.827568E+00 | loss scale: 2048.0 | grad norm: 31639.445 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7070/ 159576 | consumed samples: 267632 | elapsed time per iteration (ms): 18564.4 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.711889E+00 | loss scale: 2048.0 | grad norm: 46310.481 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7080/ 159576 | consumed samples: 268592 | elapsed time per iteration (ms): 18629.8 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.683693E+00 | loss scale: 2048.0 | grad norm: 31484.550 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7090/ 159576 | consumed samples: 269552 | elapsed time per iteration (ms): 18473.8 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.627121E+00 | loss scale: 2048.0 | grad norm: 45017.258 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7100/ 159576 | consumed samples: 270512 | elapsed time per iteration (ms): 18806.7 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.627071E+00 | loss scale: 2048.0 | grad norm: 57880.707 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7110/ 159576 | consumed samples: 271472 | elapsed time per iteration (ms): 18537.3 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.608931E+00 | loss scale: 2048.0 | grad norm: 67724.648 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan 
iterations: 0 | -time (ms) - iteration 7120/ 159576 | consumed samples: 272432 | elapsed time per iteration (ms): 18556.3 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.592625E+00 | loss scale: 2048.0 | grad norm: 67655.063 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7130/ 159576 | consumed samples: 273392 | elapsed time per iteration (ms): 18620.1 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.769730E+00 | loss scale: 2048.0 | grad norm: 50594.550 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7140/ 159576 | consumed samples: 274352 | elapsed time per iteration (ms): 18517.9 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.749163E+00 | loss scale: 2048.0 | grad norm: 30940.535 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7150/ 159576 | consumed samples: 275312 | elapsed time per iteration (ms): 18726.4 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.695554E+00 | loss scale: 2048.0 | grad norm: 49756.042 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -[2021-09-27 22:31:42] PULSE: tr8-104B is running for 4:48:16 since 2021-09-27T17:43:26 (1271196 on 'gpu_p13' partition (r7i7n[6-8],r8i0n[0-8],r8i1n[0-4],r8i7n[3-8],r9i0n[0-6,8],r9i1n[0-8],r9i2n0,r9i4n8,r9i5n[0-8],r9i6n[0-8],r9i7n[3-6]) - iteration 7160/ 159576 | consumed samples: 276272 | elapsed time per iteration (ms): 18567.4 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.630823E+00 | loss scale: 2048.0 | grad norm: 46573.225 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7170/ 159576 | consumed samples: 277232 | elapsed time per iteration (ms): 18787.6 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.637067E+00 | loss scale: 2048.0 | grad norm: 47650.692 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7180/ 159576 | consumed samples: 278192 | elapsed time per iteration (ms): 18669.9 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.663966E+00 | loss scale: 2048.0 | grad norm: 54677.698 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7190/ 159576 | consumed samples: 279152 | elapsed time per iteration (ms): 18711.4 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.603532E+00 | loss scale: 2048.0 | grad norm: 75914.515 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7200/ 159576 | consumed samples: 280112 | elapsed time per iteration (ms): 18682.4 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.571133E+00 | loss scale: 2048.0 | grad norm: 74379.166 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7210/ 159576 | consumed samples: 281072 | elapsed time per iteration (ms): 18622.6 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.584048E+00 | loss scale: 2048.0 | grad norm: 75888.414 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7220/ 159576 | consumed samples: 282032 | elapsed time per iteration (ms): 18555.6 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.554535E+00 | loss scale: 2048.0 | grad norm: 
90934.334 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7230/ 159576 | consumed samples: 282992 | elapsed time per iteration (ms): 18600.5 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.558411E+00 | loss scale: 2048.0 | grad norm: 54832.822 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7240/ 159576 | consumed samples: 284032 | elapsed time per iteration (ms): 19119.6 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.585645E+00 | loss scale: 2048.0 | grad norm: 116769.600 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7250/ 159576 | consumed samples: 285152 | elapsed time per iteration (ms): 19421.9 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.554094E+00 | loss scale: 2048.0 | grad norm: 79780.312 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7260/ 159576 | consumed samples: 286272 | elapsed time per iteration (ms): 19643.2 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.545351E+00 | loss scale: 2048.0 | grad norm: 153165.121 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7270/ 159576 | consumed samples: 287392 | elapsed time per iteration (ms): 19873.2 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.548807E+00 | loss scale: 2048.0 | grad norm: 96725.418 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7280/ 159576 | consumed samples: 288512 | elapsed time per iteration (ms): 19830.3 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.532312E+00 | loss scale: 2048.0 | grad norm: 85054.846 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7290/ 159576 | consumed samples: 289632 | elapsed time per iteration (ms): 19469.1 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.535855E+00 | loss scale: 2048.0 | grad norm: 66255.480 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7300/ 159576 | consumed samples: 290752 | elapsed time per iteration (ms): 19578.9 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.583752E+00 | loss scale: 2048.0 | grad norm: 61901.507 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7310/ 159576 | consumed samples: 291872 | elapsed time per iteration (ms): 19646.2 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.539584E+00 | loss scale: 2048.0 | grad norm: 68238.513 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7320/ 159576 | consumed samples: 292992 | elapsed time per iteration (ms): 19642.5 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.526649E+00 | loss scale: 2048.0 | grad norm: 69527.941 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7330/ 159576 | consumed samples: 294112 | elapsed time per iteration (ms): 19508.3 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.514026E+00 | loss scale: 2048.0 | grad norm: 63745.755 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7340/ 159576 | consumed samples: 
295232 | elapsed time per iteration (ms): 19676.4 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.519949E+00 | loss scale: 2048.0 | grad norm: 96730.566 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-[2021-09-27 23:32:04] PULSE: tr8-104B is running for 5:48:38 since 2021-09-27T17:43:26 (1271196 on 'gpu_p13' partition (r7i7n[6-8],r8i0n[0-8],r8i1n[0-4],r8i7n[3-8],r9i0n[0-6,8],r9i1n[0-8],r9i2n0,r9i4n8,r9i5n[0-8],r9i6n[0-8],r9i7n[3-6])
- iteration 7350/ 159576 | consumed samples: 296352 | elapsed time per iteration (ms): 19539.0 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.510521E+00 | loss scale: 2048.0 | grad norm: 95201.544 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7360/ 159576 | consumed samples: 297472 | elapsed time per iteration (ms): 19834.3 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.532115E+00 | loss scale: 2048.0 | grad norm: 269153.773 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7370/ 159576 | consumed samples: 298592 | elapsed time per iteration (ms): 19564.2 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.501956E+00 | loss scale: 2048.0 | grad norm: 89998.728 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7380/ 159576 | consumed samples: 299712 | elapsed time per iteration (ms): 19672.7 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.522272E+00 | loss scale: 2048.0 | grad norm: 75724.702 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7390/ 159576 | consumed samples: 300832 | elapsed time per iteration (ms): 19562.0 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.511443E+00 | loss scale: 2048.0 | grad norm: 89537.752 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7400/ 159576 | consumed samples: 301952 | elapsed time per iteration (ms): 19728.4 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.534271E+00 | loss scale: 2048.0 | grad norm: 79036.616 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7410/ 159576 | consumed samples: 303072 | elapsed time per iteration (ms): 19731.8 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.550716E+00 | loss scale: 2048.0 | grad norm: 60002.009 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7420/ 159576 | consumed samples: 304192 | elapsed time per iteration (ms): 19733.0 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.546501E+00 | loss scale: 2048.0 | grad norm: 69147.056 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7430/ 159576 | consumed samples: 305312 | elapsed time per iteration (ms): 19483.2 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.560014E+00 | loss scale: 2048.0 | grad norm: 75450.439 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7440/ 159576 | consumed samples: 306432 | elapsed time per iteration (ms): 19613.5 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.523249E+00 | loss scale: 2048.0 | grad norm: 104393.288 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7450/ 159576 | consumed samples: 307552 | elapsed time per iteration (ms): 19763.9 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.510474E+00 | loss scale: 4096.0 | grad norm: 189305.762 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7460/ 159576 | consumed samples: 308672 | elapsed time per iteration (ms): 19871.2 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.501906E+00 | loss scale: 4096.0 | grad norm: 277069.826 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7470/ 159576 | consumed samples: 309792 | elapsed time per iteration (ms): 18903.0 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.497433E+00 | loss scale: 4096.0 | grad norm: 225644.862 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7480/ 159576 | consumed samples: 310912 | elapsed time per iteration (ms): 19707.4 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.488033E+00 | loss scale: 4096.0 | grad norm: 230163.205 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7490/ 159576 | consumed samples: 312032 | elapsed time per iteration (ms): 19720.9 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.505843E+00 | loss scale: 4096.0 | grad norm: 238654.612 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7500/ 159576 | consumed samples: 313152 | elapsed time per iteration (ms): 18950.8 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.477815E+00 | loss scale: 2048.0 | grad norm: 106401.719 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-saving checkpoint at iteration 7500 to /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints
-[2021-09-28 00:24:01,519] [INFO] [logging.py:68:log_dist] [Rank 0] Saving model checkpoint: /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/global_step7500/mp_rank_00_model_states.pt
- successfully saved checkpoint at iteration 7500 to /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints
-time (ms) | save-checkpoint: 17115.61
- iteration 7510/ 159576 | consumed samples: 314272 | elapsed time per iteration (ms): 21118.3 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.494813E+00 | loss scale: 2048.0 | grad norm: 111065.941 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7520/ 159576 | consumed samples: 315392 | elapsed time per iteration (ms): 19805.8 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.508061E+00 | loss scale: 2048.0 | grad norm: 108163.665 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-[2021-09-28 00:32:54] PULSE: tr8-104B is running for 6:49:28 since 2021-09-27T17:43:26 (1271196 on 'gpu_p13' partition (r7i7n[6-8],r8i0n[0-8],r8i1n[0-4],r8i7n[3-8],r9i0n[0-6,8],r9i1n[0-8],r9i2n0,r9i4n8,r9i5n[0-8],r9i6n[0-8],r9i7n[3-6])
- iteration 7530/ 159576 | consumed samples: 316512 | elapsed time per iteration (ms): 19675.1 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.531902E+00 | loss scale: 2048.0 | grad norm: 113133.301 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7540/ 159576 | consumed samples: 317632 | elapsed time per iteration (ms): 19542.7 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.512622E+00 | loss scale: 2048.0 | grad norm: 124840.322 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7550/ 159576 | consumed samples: 318752 | elapsed time per iteration (ms): 19516.2 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.501436E+00 | loss scale: 2048.0 | grad norm: 133229.950 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7560/ 159576 | consumed samples: 319872 | elapsed time per iteration (ms): 19503.1 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.490542E+00 | loss scale: 2048.0 | grad norm: 71964.190 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7570/ 159576 | consumed samples: 320992 | elapsed time per iteration (ms): 19421.6 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.521871E+00 | loss scale: 2048.0 | grad norm: 88801.230 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7580/ 159576 | consumed samples: 322112 | elapsed time per iteration (ms): 19481.2 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.505743E+00 | loss scale: 2048.0 | grad norm: 284454.050 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7590/ 159576 | consumed samples: 323232 | elapsed time per iteration (ms): 19560.8 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.490807E+00 | loss scale: 2048.0 | grad norm: 110863.220 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7600/ 159576 | consumed samples: 324352 | elapsed time per iteration (ms): 19566.7 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.490352E+00 | loss scale: 2048.0 | grad norm: 99394.185 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7610/ 159576 | consumed samples: 325472 | elapsed time per iteration (ms): 19546.1 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.487664E+00 | loss scale: 2048.0 | grad norm: 98963.244 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7620/ 159576 | consumed samples: 326592 | elapsed time per iteration (ms): 19448.4 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.495935E+00 | loss scale: 2048.0 | grad norm: 80186.399 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7630/ 159576 | consumed samples: 327712 | elapsed time per iteration (ms): 19586.5 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.485136E+00 | loss scale: 2048.0 | grad norm: 90794.926 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7640/ 159576 | consumed samples: 328832 | elapsed time per iteration (ms): 19579.4 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.484132E+00 | loss scale: 2048.0 | grad norm: 120050.606 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7650/ 159576 | consumed samples: 329952 | elapsed time per iteration (ms): 19625.6 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.474982E+00 | loss scale: 2048.0 | grad norm: 132690.701 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7660/ 159576 | consumed samples: 331120 | elapsed time per iteration (ms): 19869.8 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 6.502007E+00 | loss scale: 2048.0 | grad norm: 141077.545 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7670/ 159576 | consumed samples: 332400 | elapsed time per iteration (ms): 20699.4 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 6.459695E+00 | loss scale: 2048.0 | grad norm: 170892.684 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7680/ 159576 | consumed samples: 333680 | elapsed time per iteration (ms): 20602.2 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 6.471451E+00 | loss scale: 2048.0 | grad norm: 186408.144 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7690/ 159576 | consumed samples: 334960 | elapsed time per iteration (ms): 20925.9 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 6.450164E+00 | loss scale: 2048.0 | grad norm: 126551.055 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7700/ 159576 | consumed samples: 336240 | elapsed time per iteration (ms): 20872.8 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 6.483758E+00 | loss scale: 2048.0 | grad norm: 113828.612 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-[2021-09-28 01:32:21] PULSE: tr8-104B is running for 7:48:55 since 2021-09-27T17:43:26 (1271196 on 'gpu_p13' partition (r7i7n[6-8],r8i0n[0-8],r8i1n[0-4],r8i7n[3-8],r9i0n[0-6,8],r9i1n[0-8],r9i2n0,r9i4n8,r9i5n[0-8],r9i6n[0-8],r9i7n[3-6])
- iteration 7710/ 159576 | consumed samples: 337520 | elapsed time per iteration (ms): 20786.9 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 6.474139E+00 | loss scale: 2048.0 | grad norm: 92984.196 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7720/ 159576 | consumed samples: 338800 | elapsed time per iteration (ms): 20911.7 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 6.465121E+00 | loss scale: 2048.0 | grad norm: 101949.520 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7730/ 159576 | consumed samples: 340080 | elapsed time per iteration (ms): 20160.2 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 6.493755E+00 | loss scale: 1024.0 | grad norm: 47045.415 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7740/ 159576 | consumed samples: 341360 | elapsed time per iteration (ms): 20757.9 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 6.475374E+00 | loss scale: 1024.0 | grad norm: 62044.012 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7750/ 159576 | consumed samples: 342640 | elapsed time per iteration (ms): 20801.0 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 6.480064E+00 | loss scale: 1024.0 | grad norm: 55223.754 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7760/ 159576 | consumed samples: 343920 | elapsed time per iteration (ms): 20712.1 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 6.477321E+00 | loss scale: 1024.0 | grad norm: 75612.351 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7770/ 159576 | consumed samples: 345200 | elapsed time per iteration (ms): 20773.8 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 6.486430E+00 | loss scale: 1024.0 | grad norm: 57309.889 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7780/ 159576 | consumed samples: 346480 | elapsed time per iteration (ms): 20686.3 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 6.465924E+00 | loss scale: 1024.0 | grad norm: 78208.337 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7790/ 159576 | consumed samples: 347760 | elapsed time per iteration (ms): 20744.2 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 6.439983E+00 | loss scale: 1024.0 | grad norm: 85978.082 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7800/ 159576 | consumed samples: 349040 | elapsed time per iteration (ms): 20858.0 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 6.466323E+00 | loss scale: 1024.0 | grad norm: 83254.794 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7810/ 159576 | consumed samples: 350320 | elapsed time per iteration (ms): 20728.1 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 6.452026E+00 | loss scale: 1024.0 | grad norm: 82300.274 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7820/ 159576 | consumed samples: 351600 | elapsed time per iteration (ms): 20746.4 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 6.471143E+00 | loss scale: 1024.0 | grad norm: 70196.821 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7830/ 159576 | consumed samples: 352880 | elapsed time per iteration (ms): 20801.6 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 6.484294E+00 | loss scale: 1024.0 | grad norm: 52460.842 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7840/ 159576 | consumed samples: 354160 | elapsed time per iteration (ms): 20885.5 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 6.492403E+00 | loss scale: 1024.0 | grad norm: 61833.655 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7850/ 159576 | consumed samples: 355440 | elapsed time per iteration (ms): 20657.1 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 6.466279E+00 | loss scale: 1024.0 | grad norm: 62285.100 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7860/ 159576 | consumed samples: 356720 | elapsed time per iteration (ms): 19964.7 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 6.448762E+00 | loss scale: 512.0 | grad norm: 76192.061 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7870/ 159576 | consumed samples: 358000 | elapsed time per iteration (ms): 20780.6 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 6.468709E+00 | loss scale: 512.0 | grad norm: 27166.098 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7880/ 159576 | consumed samples: 359280 | elapsed time per iteration (ms): 20507.3 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 6.619281E+00 | loss scale: 512.0 | grad norm: 27451.209 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-[2021-09-28 02:32:25] PULSE: tr8-104B is scheduled to start in 17:52:43 (at 2021-09-28T20:25:09) (1277218 on 'gpu_p13' partition)
-[2021-09-28 02:32:25] PULSE: tr8-104B is waiting for the previous job to finish before scheduling a new one using the dependency mechanism (1277216 on 'gpu_p13' partition)
-[2021-09-28 02:32:25] PULSE: tr8-104B is running for 8:48:59 since 2021-09-27T17:43:26 (1271196 on 'gpu_p13' partition (r7i7n[6-8],r8i0n[0-8],r8i1n[0-4],r8i7n[3-8],r9i0n[0-6,8],r9i1n[0-8],r9i2n0,r9i4n8,r9i5n[0-8],r9i6n[0-8],r9i7n[3-6])
- iteration 7890/ 159576 | consumed samples: 360560 | elapsed time per iteration (ms): 20685.2 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 6.639037E+00 | loss scale: 512.0 | grad norm: 21160.659 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7900/ 159576 | consumed samples: 361840 | elapsed time per iteration (ms): 20486.0 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 7.220924E+00 | loss scale: 512.0 | grad norm: 53815.762 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7910/ 159576 | consumed samples: 363120 | elapsed time per iteration (ms): 20468.9 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 7.521174E+00 | loss scale: 512.0 | grad norm: 36754.779 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7920/ 159576 | consumed samples: 364400 | elapsed time per iteration (ms): 20813.9 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 7.992101E+00 | loss scale: 512.0 | grad norm: 30259.595 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7930/ 159576 | consumed samples: 365680 | elapsed time per iteration (ms): 20655.4 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 7.722378E+00 | loss scale: 512.0 | grad norm: 10480.959 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7940/ 159576 | consumed samples: 366960 | elapsed time per iteration (ms): 20950.6 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 7.599706E+00 | loss scale: 512.0 | grad norm: 6665.941 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7950/ 159576 | consumed samples: 368240 | elapsed time per iteration (ms): 20603.6 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 7.160472E+00 | loss scale: 512.0 | grad norm: 8111.009 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7960/ 159576 | consumed samples: 369520 | elapsed time per iteration (ms): 20624.6 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 7.089828E+00 | loss scale: 512.0 | grad norm: 6778.536 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7970/ 159576 | consumed samples: 370800 | elapsed time per iteration (ms): 20691.0 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 7.228701E+00 | loss scale: 512.0 | grad norm: 20747.579 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7980/ 159576 | consumed samples: 372080 | elapsed time per iteration (ms): 20583.0 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 7.185581E+00 | loss scale: 512.0 | grad norm: 12240.877 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7990/ 159576 | consumed samples: 373360 | elapsed time per iteration (ms): 20386.0 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 7.279394E+00 | loss scale: 512.0 | grad norm: 11910.373 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-[2021-09-28 03:13:26,386] [INFO] [logging.py:68:log_dist] [Rank 0] step=8000, skipped=20, lr=[5.99997953854461e-05, 5.99997953854461e-05], mom=[(0.9, 0.999), (0.9, 0.999)]
- iteration 8000/ 159576 | consumed samples: 374640 | elapsed time per iteration (ms): 20647.9 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 7.117274E+00 | loss scale: 512.0 | grad norm: 6363.554 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-steps: 8000 loss: 6.9948 iter time (s): 0.010 samples/sec: 12373.041
--------------------------------------------------------------------------------------------------
- validation loss at iteration 8000 | lm loss value: 6.930026E+00 | lm loss PPL: 1.022520E+03 |
--------------------------------------------------------------------------------------------------
- iteration 8010/ 159576 | consumed samples: 375920 | elapsed time per iteration (ms): 22250.2 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 6.907596E+00 | loss scale: 512.0 | grad norm: 5175.818 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8020/ 159576 | consumed samples: 377200 | elapsed time per iteration (ms): 20702.8 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 6.903972E+00 | loss scale: 512.0 | grad norm: 8915.422 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8030/ 159576 | consumed samples: 378544 | elapsed time per iteration (ms): 21181.5 | learning rate: 6.000E-05 | global batch size: 144 | lm loss: 6.942516E+00 | loss scale: 512.0 | grad norm: 8113.065 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8040/ 159576 | consumed samples: 379984 | elapsed time per iteration (ms): 21914.5 | learning rate: 6.000E-05 | global batch size: 144 | lm loss: 6.923864E+00 | loss scale: 512.0 | grad norm: 19249.730 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8050/ 159576 | consumed samples: 381424 | elapsed time per iteration (ms): 21865.5 | learning rate: 6.000E-05 | global batch size: 144 | lm loss: 6.876669E+00 | loss scale: 512.0 | grad norm: 7890.746 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-[2021-09-28 03:32:27] PULSE: tr8-104B is scheduled to start in 19:12:32 (at 2021-09-28T22:45:00) (1277218 on 'gpu_p13' partition)
-[2021-09-28 03:32:27] PULSE: tr8-104B is waiting for the previous job to finish before scheduling a new one using the dependency mechanism (1277295_[1-10%1] on 'gpu_p13' partition)
-[2021-09-28 03:32:27] PULSE: tr8-104B is running for 9:49:01 since 2021-09-27T17:43:26 (1271196 on 'gpu_p13' partition (r7i7n[6-8],r8i0n[0-8],r8i1n[0-4],r8i7n[3-8],r9i0n[0-6,8],r9i1n[0-8],r9i2n0,r9i4n8,r9i5n[0-8],r9i6n[0-8],r9i7n[3-6])
- iteration 8060/ 159576 | consumed samples: 382864 | elapsed time per iteration (ms): 21779.1 | learning rate: 6.000E-05 | global batch size: 144 | lm loss: 6.788055E+00 | loss scale: 512.0 | grad norm: 9618.538 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8070/ 159576 | consumed samples: 384304 | elapsed time per iteration (ms): 21643.3 | learning rate: 6.000E-05 | global batch size: 144 | lm loss: 6.808229E+00 | loss scale: 512.0 | grad norm: 8857.044 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8080/ 159576 | consumed samples: 385744 | elapsed time per iteration (ms): 21639.1 | learning rate: 6.000E-05 | global batch size: 144 | lm loss: 6.901846E+00 | loss scale: 512.0 | grad norm: 8983.602 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8090/ 159576 | consumed samples: 387184 | elapsed time per iteration (ms): 22052.4 | learning rate: 6.000E-05 | global batch size: 144 | lm loss: 6.863363E+00 | loss scale: 512.0 | grad norm: 9399.920 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8100/ 159576 | consumed samples: 388624 | elapsed time per iteration (ms): 21866.1 | learning rate: 6.000E-05 | global batch size: 144 | lm loss: 6.843295E+00 | loss scale: 512.0 | grad norm: 8690.802 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8110/ 159576 | consumed samples: 390064 | elapsed time per iteration (ms): 21853.1 | learning rate: 6.000E-05 | global batch size: 144 | lm loss: 6.893594E+00 | loss scale: 512.0 | grad norm: 13780.366 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8120/ 159576 | consumed samples: 391504 | elapsed time per iteration (ms): 21812.6 | learning rate: 6.000E-05 | global batch size: 144 | lm loss: 6.924708E+00 | loss scale: 512.0 | grad norm: 7097.791 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8130/ 159576 | consumed samples: 392944 | elapsed time per iteration (ms): 21586.9 | learning rate: 6.000E-05 | global batch size: 144 | lm loss: 6.829758E+00 | loss scale: 512.0 | grad norm: 7266.647 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8140/ 159576 | consumed samples: 394384 | elapsed time per iteration (ms): 21935.4 | learning rate: 6.000E-05 | global batch size: 144 | lm loss: 6.820535E+00 | loss scale: 512.0 | grad norm: 7758.235 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8150/ 159576 | consumed samples: 395824 | elapsed time per iteration (ms): 21921.3 | learning rate: 6.000E-05 | global batch size: 144 | lm loss: 6.822125E+00 | loss scale: 512.0 | grad norm: 6965.512 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8160/ 159576 | consumed samples: 397264 | elapsed time per iteration (ms): 21703.6 | learning rate: 6.000E-05 | global batch size: 144 | lm loss: 6.756792E+00 | loss scale: 512.0 | grad norm: 9871.280 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8170/ 159576 | consumed samples: 398704 | elapsed time per iteration (ms): 21847.9 | learning rate: 6.000E-05 | global batch size: 144 | lm loss: 6.773450E+00 | loss scale: 512.0 | grad norm: 12746.115 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8180/ 159576 | consumed samples: 400144 | elapsed time per iteration (ms): 21833.8 | learning rate: 6.000E-05 | global batch size: 144 | lm loss: 6.785934E+00 | loss scale: 512.0 | grad norm: 5598.866 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8190/ 159576 | consumed samples: 401584 | elapsed time per iteration (ms): 21797.4 | learning rate: 6.000E-05 | global batch size: 144 | lm loss: 6.870234E+00 | loss scale: 512.0 | grad norm: 6782.384 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8200/ 159576 | consumed samples: 403024 | elapsed time per iteration (ms): 21810.1 | learning rate: 6.000E-05 | global batch size: 144 | lm loss: 6.838039E+00 | loss scale: 512.0 | grad norm: 9577.527 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8210/ 159576 | consumed samples: 404464 | elapsed time per iteration (ms): 21905.3 | learning rate: 6.000E-05 | global batch size: 144 | lm loss: 6.807652E+00 | loss scale: 512.0 | grad norm: 11918.248 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-[2021-09-28 04:33:02] PULSE: tr8-104B is scheduled to start in 18:11:57 (at 2021-09-28T22:45:00) (1277218 on 'gpu_p13' partition)
-[2021-09-28 04:33:02] PULSE: tr8-104B is waiting for the previous job to finish before scheduling a new one using the dependency mechanism (1277295_[1-10%1] on 'gpu_p13' partition)
-[2021-09-28 04:33:02] PULSE: tr8-104B is running for 10:49:36 since 2021-09-27T17:43:26 (1271196 on 'gpu_p13' partition (r7i7n[6-8],r8i0n[0-8],r8i1n[0-4],r8i7n[3-8],r9i0n[0-6,8],r9i1n[0-8],r9i2n0,r9i4n8,r9i5n[0-8],r9i6n[0-8],r9i7n[3-6])
- iteration 8220/ 159576 | consumed samples: 405904 | elapsed time per iteration (ms): 21977.1 | learning rate: 6.000E-05 | global batch size: 144 | lm loss: 6.819595E+00 | loss scale: 512.0 | grad norm: 6882.121 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8230/ 159576 | consumed samples: 407344 | elapsed time per iteration (ms): 21630.3 | learning rate: 6.000E-05 | global batch size: 144 | lm loss: 6.880849E+00 | loss scale: 512.0 | grad norm: 17414.946 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8240/ 159576 | consumed samples: 408784 | elapsed time per iteration (ms): 21894.4 | learning rate: 6.000E-05 | global batch size: 144 | lm loss: 6.930541E+00 | loss scale: 512.0 | grad norm: 7836.035 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8250/ 159576 | consumed samples: 410224 | elapsed time per iteration (ms): 21731.4 | learning rate: 6.000E-05 | global batch size: 144 | lm loss: 6.906449E+00 | loss scale: 512.0 | grad norm: 7978.667 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8260/ 159576 | consumed samples: 411664 | elapsed time per iteration (ms): 21776.5 | learning rate: 6.000E-05 | global batch size: 144 | lm loss: 6.893109E+00 | loss scale: 512.0 | grad norm: 9114.270 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8270/ 159576 | consumed samples: 413104 | elapsed time per iteration (ms): 22166.2 | learning rate: 6.000E-05 | global batch size: 144 | lm loss: 6.885992E+00 | loss scale: 512.0 | grad norm: 13085.411 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8280/ 159576 | consumed samples: 414544 | elapsed time per iteration (ms): 21762.3 | learning rate: 6.000E-05 | global batch size: 144 | lm loss: 6.789729E+00 | loss scale: 512.0 | grad norm: 11443.626 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8290/ 159576 | consumed samples: 415984 | elapsed time per iteration (ms): 21743.1 | learning rate: 6.000E-05 | global batch size: 144 | lm loss: 6.784861E+00 | loss scale: 512.0 | grad norm: 10437.240 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8300/ 159576 | consumed samples: 417424 | elapsed time per iteration (ms): 21878.0 | learning rate: 6.000E-05 | global batch size: 144 | lm loss: 6.831153E+00 | loss scale: 512.0 | grad norm: 6842.857 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8310/ 159576 | consumed samples: 418864 | elapsed time per iteration (ms): 21680.7 | learning rate: 6.000E-05 | global batch size: 144 | lm loss: 6.847891E+00 | loss scale: 512.0 | grad norm: 8236.158 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8320/ 159576 | consumed samples: 420304 | elapsed time per iteration (ms): 21650.4 | learning rate: 6.000E-05 | global batch size: 144 | lm loss: 6.831273E+00 | loss scale: 512.0 | grad norm: 10757.345 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8330/ 159576 | consumed samples: 421744 | elapsed time per iteration (ms): 21761.1 | learning rate: 6.000E-05 | global batch size: 144 | lm loss: 6.866577E+00 | loss scale: 512.0 | grad norm: 9414.173 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8340/ 159576 | consumed samples: 423184 | elapsed time per iteration (ms): 22000.3 | learning rate: 6.000E-05 | global batch size: 144 | lm loss: 6.927114E+00 | loss scale: 512.0 | grad norm: 22264.468 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8350/ 159576 | consumed samples: 424624 | elapsed time per iteration (ms): 21732.0 | learning rate: 6.000E-05 | global batch size: 144 | lm loss: 7.098891E+00 | loss scale: 512.0 | grad norm: 10280.295 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8360/ 159576 | consumed samples: 426160 | elapsed time per iteration (ms): 22517.6 | learning rate: 6.000E-05 | global batch size: 160 | lm loss: 6.958164E+00 | loss scale: 1024.0 | grad norm: 13178.434 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8370/ 159576 | consumed samples: 427760 | elapsed time per iteration (ms): 23182.1 | learning rate: 6.000E-05 | global batch size: 160 | lm loss: 6.889060E+00 | loss scale: 1024.0 | grad norm: 18842.234 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8380/ 159576 | consumed samples: 429360 | elapsed time per iteration (ms): 23097.1 | learning rate: 6.000E-05 | global batch size: 160 | lm loss: 6.878168E+00 | loss scale: 1024.0 | grad norm: 18421.706 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-[2021-09-28 05:32:36] PULSE: tr8-104B is scheduled to start in 17:12:23 (at 2021-09-28T22:45:00) (1277218 on 'gpu_p13' partition)
-[2021-09-28 05:32:36] PULSE: tr8-104B is waiting for the previous job to finish before scheduling a new one using the dependency mechanism (1277295_[1-10%1] on 'gpu_p13' partition)
-[2021-09-28 05:32:36] PULSE: tr8-104B is running for 11:49:10 since 2021-09-27T17:43:26 (1271196 on 'gpu_p13' partition (r7i7n[6-8],r8i0n[0-8],r8i1n[0-4],r8i7n[3-8],r9i0n[0-6,8],r9i1n[0-8],r9i2n0,r9i4n8,r9i5n[0-8],r9i6n[0-8],r9i7n[3-6])
- iteration 8390/ 159576 | consumed samples: 430960 | elapsed time per iteration (ms): 22911.1 | learning rate: 6.000E-05 | global batch size: 160 | lm loss: 6.836983E+00 | loss scale: 1024.0 | grad norm: 21055.325 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8400/ 159576 | consumed samples: 432560 | elapsed time per iteration (ms): 23311.7 | learning rate: 6.000E-05 | global batch size: 160 | lm loss: 6.867126E+00 | loss scale: 1024.0 | grad norm: 13309.684 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8410/ 159576 | consumed samples: 434160 | elapsed time per iteration (ms): 22945.0 | learning rate: 6.000E-05 | global batch size: 160 | lm loss: 6.896465E+00 | loss scale: 1024.0 | grad norm: 24249.264 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8420/ 159576 | consumed samples: 435760 | elapsed time per iteration (ms): 22797.0 | learning rate: 6.000E-05 | global batch size: 160 | lm loss: 6.923830E+00 | loss scale: 1024.0 | grad norm: 16621.010 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8430/ 159576 | consumed samples: 437360 | elapsed time per iteration (ms): 23019.9 | learning rate: 6.000E-05 | global batch size: 160 | lm loss: 6.940806E+00 | loss scale: 1024.0 | grad norm: 15050.371 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8440/ 159576 | consumed samples: 438960 | elapsed time per iteration (ms): 23026.2 | learning rate: 6.000E-05 | global batch size: 160 | lm loss: 6.984757E+00 | loss scale: 1024.0 | grad norm: 22968.730 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8450/ 159576 | consumed samples: 440560 | elapsed time per iteration (ms): 22903.0 | learning rate: 6.000E-05 | global batch size: 160 | lm loss: 6.970832E+00 | loss scale: 1024.0 | grad norm: 25206.012 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8460/ 159576 | consumed samples: 442160 | elapsed time per iteration (ms): 22992.7 | learning rate: 6.000E-05 | global batch size: 160 | lm loss: 6.992513E+00 | loss scale: 1024.0 | grad norm: 9219.678 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8470/ 159576 | consumed samples: 443760 | elapsed time per iteration (ms): 23036.6 | learning rate: 6.000E-05 | global batch size: 160 | lm loss: 7.053975E+00 | loss scale: 1024.0 | grad norm: 9743.104 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8480/ 159576 | consumed samples: 445360 | elapsed time per iteration (ms): 22710.5 | learning rate: 6.000E-05 | global batch size: 160 | lm loss: 7.087634E+00 | loss scale: 1024.0 | grad norm: 36403.836 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8490/ 159576 | consumed samples: 446960 | elapsed time per iteration (ms): 22994.9 | learning rate: 6.000E-05 | global batch size: 160 | lm loss: 7.142048E+00 | loss scale: 1024.0 | grad norm: 8807.945 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8500/ 159576 | consumed samples: 448560 | elapsed time per iteration (ms): 22707.3 | learning rate: 6.000E-05 | global batch size: 160 | lm loss: 7.160313E+00 | loss scale: 1024.0 | grad norm: 9148.356 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8510/ 159576 | consumed samples: 450160 | elapsed time per iteration (ms): 22963.9 | learning rate: 6.000E-05 | global batch size: 160 | lm loss: 7.277474E+00 | loss scale: 1024.0 | grad norm: 43448.626 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8520/ 159576 | consumed samples: 451760 | elapsed time per iteration (ms): 19193.8 | learning rate: 6.000E-05 | global batch size: 160 | loss scale: 64.0 | grad norm: 5533.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8530/ 159576 | consumed samples: 453360 | elapsed time per iteration (ms): 15554.5 | learning rate: 6.000E-05 | global batch size: 160 | loss scale: 1.0 | grad norm: 5533.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8540/ 159576 | consumed samples: 454960 | elapsed time per iteration (ms): 15434.8 | learning rate: 6.000E-05 | global batch size: 160 | loss scale: 1.0 | grad norm: 5533.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8550/ 159576 | consumed samples: 456560 | elapsed time per iteration (ms): 15729.0 | learning rate: 6.000E-05 | global batch size: 160 | loss scale: 1.0 | grad norm: 5533.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-[2021-09-28 06:32:50] PULSE: tr8-104B is scheduled to start in 17:29:26 (at 2021-09-29T00:02:17) (1277218 on 'gpu_p13' partition)
-[2021-09-28 06:32:50] PULSE: tr8-104B is waiting for the previous job to finish before scheduling a new one using the dependency mechanism (1277295_[1-10%1] on 'gpu_p13' partition)
-[2021-09-28 06:32:50] PULSE: tr8-104B is running for 12:49:24 since 2021-09-27T17:43:26 (1271196 on 'gpu_p13' partition (r7i7n[6-8],r8i0n[0-8],r8i1n[0-4],r8i7n[3-8],r9i0n[0-6,8],r9i1n[0-8],r9i2n0,r9i4n8,r9i5n[0-8],r9i6n[0-8],r9i7n[3-6])
- iteration 8560/ 159576 | consumed samples: 458160 | elapsed time per iteration (ms): 15526.6 | learning rate: 6.000E-05 | global batch size: 160 | loss scale: 1.0 | grad norm: 5533.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8570/ 159576 | consumed samples: 459760 | elapsed time per iteration (ms): 15343.9 | learning rate: 6.000E-05 | global batch size: 160 | loss scale: 1.0 | grad norm: 5533.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8580/ 159576 | consumed samples: 461360 | elapsed time per iteration (ms): 15516.0 | learning rate: 6.000E-05 | global batch size: 160 | loss scale: 1.0 | grad norm: 5533.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8590/ 159576 | consumed samples: 462960 | elapsed time per iteration (ms): 15788.5 | learning rate: 6.000E-05 | global batch size: 160 | loss scale: 1.0 | grad norm: 5533.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8600/ 159576 | consumed samples: 464560 | elapsed time per iteration (ms): 15421.5 | learning rate: 6.000E-05 | global batch size: 160 | loss scale: 1.0 | grad norm: 5533.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8610/ 159576 | consumed samples: 466160 | elapsed time per iteration (ms): 15365.4 | learning rate: 6.000E-05 | global batch size: 160 | loss scale: 1.0 | grad norm: 5533.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8620/ 159576 | consumed samples: 467760 | elapsed time per iteration (ms): 15460.6 | learning rate: 6.000E-05 | global batch size: 160 | loss scale: 1.0 | grad norm: 5533.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8630/ 159576 | consumed samples: 469360 | elapsed time per iteration (ms): 15794.2 | learning rate: 6.000E-05 | global batch size: 160 | loss scale: 1.0 | grad norm: 5533.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8640/ 159576 | consumed samples: 470960 | elapsed time per iteration (ms): 15928.5 | learning rate: 6.000E-05 | global batch size: 160 | loss scale: 1.0 | grad norm: 5533.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8650/ 159576 | consumed samples: 472560 | elapsed time per iteration (ms): 15514.8 | learning rate: 6.000E-05 | global batch size: 160 | loss scale: 1.0 | grad norm: 5533.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8660/ 159576 | consumed samples: 474320 | elapsed time per iteration (ms): 16639.1 | learning rate: 6.000E-05 | global batch size: 176 | loss scale: 1.0 | grad norm: 5533.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8670/ 159576 | consumed samples: 476080 | elapsed time per iteration (ms): 16569.6 | learning rate: 6.000E-05 | global batch size: 176 | loss scale: 1.0 | grad norm: 5533.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8680/ 159576 | consumed samples: 477840 | elapsed time per iteration (ms): 16695.6 | learning rate: 6.000E-05 | global batch size: 176 | loss scale: 1.0 | grad norm: 5533.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8690/ 159576 | consumed samples: 479600 | elapsed time per iteration (ms): 16700.3 | learning rate: 6.000E-05 | global batch size: 176 | loss scale: 1.0 | grad norm: 5533.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8700/ 159576 | consumed samples: 481360 | elapsed time per iteration (ms): 16569.3 | learning rate: 6.000E-05 | global batch size: 176 | loss scale: 1.0 | grad norm: 5533.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8710/ 159576 | consumed samples: 483120 | elapsed time per iteration (ms): 16526.6 | learning rate: 6.000E-05 | global batch size: 176 | loss scale: 1.0 | grad norm: 5533.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8720/ 159576 | consumed samples: 484880 | elapsed time per iteration (ms): 16370.8 | learning rate: 6.000E-05 | global batch size: 176 | loss scale: 1.0 | grad norm: 5533.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8730/ 159576 | consumed samples: 486640 | elapsed time per iteration (ms): 16678.1 | learning rate: 6.000E-05 | global batch size: 176 | loss scale: 1.0 | grad norm: 5533.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8740/ 159576 | consumed samples: 488400 | elapsed time per iteration (ms): 16715.4 | learning rate: 6.000E-05 | global batch size: 176 | loss scale: 1.0 | grad norm: 5533.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8750/ 159576 | consumed samples: 490160 | elapsed time per iteration (ms): 16605.2 | learning rate: 6.000E-05 | global batch size: 176 | loss scale: 1.0 | grad norm: 5533.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8760/ 159576 | consumed samples: 491920 | elapsed time per iteration (ms): 16522.8 | learning rate: 6.000E-05 | global batch size: 176 | loss scale: 1.0 | grad norm: 5533.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8770/ 159576 | consumed samples: 493680 | elapsed time per iteration (ms): 16607.3 | learning rate: 6.000E-05 | global batch size: 176 | loss scale: 1.0 | grad norm: 5533.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-[2021-09-28 07:32:48] PULSE: tr8-104B is scheduled to start in 17:38:05 (at 2021-09-29T01:10:54) (1277218 on 'gpu_p13' partition)
-[2021-09-28 07:32:48] PULSE: tr8-104B is waiting for the previous job to finish before scheduling a new one using the dependency mechanism (1277295_[1-10%1] on 'gpu_p13' partition)
-[2021-09-28 07:32:48] PULSE: tr8-104B is running for 13:49:22 since 2021-09-27T17:43:26 (1271196 on 'gpu_p13' partition (r7i7n[6-8],r8i0n[0-8],r8i1n[0-4],r8i7n[3-8],r9i0n[0-6,8],r9i1n[0-8],r9i2n0,r9i4n8,r9i5n[0-8],r9i6n[0-8],r9i7n[3-6])
- iteration 8780/ 159576 | consumed samples: 495440 | elapsed time per iteration (ms): 16798.5 | learning rate: 6.000E-05 | global batch size: 176 | loss scale: 1.0 | grad norm: 5533.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8790/ 159576 | consumed samples: 497200 | elapsed time per iteration (ms): 16594.8 | learning rate: 6.000E-05 | global batch size: 176 | loss scale: 1.0 | grad norm: 5533.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 8800/ 159576 | consumed samples: 498960 | elapsed time per iteration (ms): 16863.3 | learning rate: 6.000E-05 | global batch size: 176 | loss scale: 1.0 | grad norm: 5533.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-srun: Job step aborted: Waiting up to 62 seconds for job step to finish.
-Killing subprocess 30115
-Killing subprocess 30116
-Killing subprocess 30117
-Killing subprocess 30118
-Main process received SIGTERM, exiting
-slurmstepd: error: *** STEP 1271196.0 ON r7i7n6 CANCELLED AT 2021-09-28T07:42:47 ***
-Main process received SIGTERM, exiting
-[2021-09-28 08:32:52] PULSE: ***ALERT: tr8-104B is not RUNNING or SCHEDULED! Alert someone at Eng WG***
-[2021-09-28 09:33:05] PULSE: ***ALERT: tr8-104B is not RUNNING or SCHEDULED! Alert someone at Eng WG***
-[2021-09-28 10:33:03] PULSE: ***ALERT: tr8-104B is not RUNNING or SCHEDULED! Alert someone at Eng WG***
-[2021-09-28 11:33:17] PULSE: ***ALERT: tr8-104B is not RUNNING or SCHEDULED! Alert someone at Eng WG***
-Traceback (most recent call last):
-  File "/gpfswork/rech/six/commun/code/tr8-104B/bigscience/tools/slurm-status.py", line 177, in <module>
-    main()
-  File "/gpfswork/rech/six/commun/code/tr8-104B/bigscience/tools/slurm-status.py", line 172, in main
-    send_email_alert_job_not_scheduled(args.job_name)
-  File "/gpfswork/rech/six/commun/code/tr8-104B/bigscience/tools/slurm-status.py", line 61, in send_email_alert_job_not_scheduled
-    send_email(subject, body)
-  File "/gpfswork/rech/six/commun/code/tr8-104B/bigscience/tools/slurm-status.py", line 39, in send_email
-    server = smtplib.SMTP("localhost")
-  File "/gpfslocalsup/pub/anaconda-py3/2020.02/lib/python3.7/smtplib.py", line 251, in __init__
-    (code, msg) = self.connect(host, port)
-  File "/gpfslocalsup/pub/anaconda-py3/2020.02/lib/python3.7/smtplib.py", line 336, in connect
-    self.sock = self._get_socket(host, port, self.timeout)
-  File "/gpfslocalsup/pub/anaconda-py3/2020.02/lib/python3.7/smtplib.py", line 307, in _get_socket
-    self.source_address)
-  File "/gpfslocalsup/pub/anaconda-py3/2020.02/lib/python3.7/socket.py", line 728, in create_connection
-    raise err
-  File "/gpfslocalsup/pub/anaconda-py3/2020.02/lib/python3.7/socket.py", line 716, in create_connection
-    sock.connect(sa)
-ConnectionRefusedError: [Errno 111] Connection refused
-[2021-09-28 12:33:29] PULSE: ***ALERT: tr8-104B is not RUNNING or SCHEDULED! Alert someone at Eng WG***
-[2021-09-28 13:33:44] PULSE: ***ALERT: tr8-104B is not RUNNING or SCHEDULED! Alert someone at Eng WG***
-[2021-09-28 14:34:11] PULSE: ***ALERT: tr8-104B is not RUNNING or SCHEDULED! Alert someone at Eng WG***
-[2021-09-28 15:33:54] PULSE: ***ALERT: tr8-104B is not RUNNING or SCHEDULED! Alert someone at Eng WG***
-[2021-09-28 16:34:11] PULSE: ***ALERT: tr8-104B is not RUNNING or SCHEDULED! Alert someone at Eng WG***
-*****************************************
-Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
-*****************************************
-*****************************************
-Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
-*****************************************
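The banner above is printed by the distributed launcher once per spawned process (the raw log repeats it dozens of times; one copy is kept here): OMP_NUM_THREADS defaults to 1 so that many processes on one node do not each start a full-size OpenMP thread pool. If CPU-side work is a bottleneck, the variable has to be raised in each worker's environment before any OpenMP-backed library initialises. A sketch of doing that from a hypothetical wrapper script; train.py and the value 4 are placeholders, not values taken from this log:

import os
import subprocess

# OMP_NUM_THREADS must be present before any OpenMP-backed library
# initialises its thread pool, so set it on the child's environment
# rather than trying to change it after the fact.
env = dict(os.environ, OMP_NUM_THREADS="4")  # example value, tune per node
subprocess.run(["python", "train.py"], env=env, check=True)  # placeholder script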
---------------------------------------------------
-DeepSpeed C++/CUDA extension op report
---------------------------------------------------
-NOTE: Ops not installed will be just-in-time (JIT) compiled at
-      runtime if needed. Op compatibility means that your system
-      meet the required dependencies to JIT install the op.
---------------------------------------------------
-JIT compiled ops requires ninja
---------------------------------------------------
-ninja .................. [OKAY]
---------------------------------------------------
-op name ................ installed .. compatible
---------------------------------------------------
-cpu_adam ............... [YES] ...... [OKAY]
-fused_adam ............. [NO] ....... [OKAY]
-fused_lamb ............. [NO] ....... [OKAY]
-sparse_attn ............ [NO] ....... [OKAY]
-transformer ............ [NO] ....... [OKAY]
-stochastic_transformer . [NO] ....... [OKAY]
---------------------------------------------------
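The block above is DeepSpeed's extension-op report, emitted by every rank at startup (the raw log interleaves all the copies; a single clean copy is kept here). In this run only cpu_adam was pre-built; the other ops are merely marked compatible and will be JIT-compiled with ninja on first use. A sketch of querying one op programmatically; it assumes DeepSpeed's op_builder API (CPUAdamBuilder with is_compatible() and load()), which may differ across DeepSpeed versions:

# Hedged sketch: inspect and, if possible, JIT-build a single DeepSpeed op.
# Assumes deepspeed.ops.op_builder.CPUAdamBuilder exposes is_compatible()
# and load(); verify against your installed DeepSpeed version.
from deepspeed.ops.op_builder import CPUAdamBuilder

builder = CPUAdamBuilder()
if builder.is_compatible():
    cpu_adam_module = builder.load()  # triggers the ninja JIT build if needed
    print("cpu_adam ready:", cpu_adam_module is not None)
else:
    print("cpu_adam cannot be JIT-built on this system")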
- [WARNING] async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`.
-async_io ............... [NO] ....... [NO]
-transformer_inference .. [NO] ....... [OKAY]
-utils .................. [YES] ...... [OKAY]
-quantizer .............. [NO] ....... [OKAY]
---------------------------------------------------
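The async_io op is the one exception in the report: every rank flags it as neither installed nor compatible because the system libaio package is absent, and the warning itself names the fix (apt install libaio-dev, or the cluster's equivalent). A self-contained check for whether the shared library is visible at runtime; ctypes.util.find_library is standard library, though a non-None result is only a necessary condition for async_io to build, not proof that it will:

from ctypes.util import find_library

# find_library("aio") returns e.g. "libaio.so.1" when libaio is installed,
# or None when it is missing (the case this log's [WARNING] reports).
libaio = find_library("aio")
if libaio is None:
    print("libaio not found: DeepSpeed's async_io op will stay disabled")
else:
    print(f"libaio present as {libaio}; async_io can be JIT-built")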
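-The op report itself does not require a training launch to reproduce. A sketch of regenerating it on demand, assuming a standard DeepSpeed install (which ships the `ds_report` console script) and a made-up file name:
-
-    # print_op_report.py (hypothetical helper)
-    import subprocess
-
-    # `ds_report` prints the same ninja check and
-    # "op name / installed / compatible" table seen in this log.
-    # Ops shown as installed=[NO], compatible=[OKAY] are JIT-built with
-    # ninja on first use; they can instead be prebuilt at install time,
-    # e.g. DS_BUILD_FUSED_ADAM=1 pip install deepspeed
-    subprocess.run(["ds_report"], check=True)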
[OKAY] --------------------------------------------------- ----------------------------------------------------------------------------------------------------- - -DeepSpeed C++/CUDA extension op report--------------------------------------------------DeepSpeed C++/CUDA extension op report - - -----------------------------------------------------------------------------------------------------DeepSpeed C++/CUDA extension op report - - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.-------------------------------------------------- - - -------------------------------------------------------------------------------------------------------------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - - - -JIT compiled ops requires ninjaDeepSpeed C++/CUDA extension op report-------------------------------------------------- - -JIT compiled ops requires ninja --------------------------------------------------- -JIT compiled ops requires ninja - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`.async_io - ............... [NO] ....... [NO] -transformer_inference .. [NO]async_io ....... ...............[OKAY] -[NO] ....... [NO] -utils .................. [YES] ...... [OKAY] -transformer_inferencequantizer ................ [NO] [NO]....... .......[OKAY] -[OKAY] --------------------------------------------------- -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -async_io ............... [NO] ....... async_io[NO] - ............... [NO] ....... [NO] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report ----------------------------------------------------------------------------------------------------- - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. 
Op compatibility means that your system - meet the required dependencies to JIT install the op.DeepSpeed C++/CUDA extension op report - ------------------------------------------------------------------------------------------------------------------------------------------------------- - - -JIT compiled ops requires ninjaDeepSpeed C++/CUDA extension op reportNOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. -transformer_inference .. [NO] transformer_inference....... ..[OKAY] -[NO] ....... [OKAY] -utils .................. [YES] utils...... ..................[OKAY] - - ----------------------------------------------------------------------------------------------------- - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. -JIT compiled ops requires ninja-------------------------------------------------- - -JIT compiled ops requires ninja -[YES] ...... [OKAY]quantizer - .............. [NO] ....... [OKAY]quantizer - .............. [NO] --------------------------------------------------....... - [OKAY] --------------------------------------------------- ------------------------------------------------------------------------------------------------------------------------------------------------------- - -DeepSpeed C++/CUDA extension op report -DeepSpeed C++/CUDA extension op report ---------------------------------------------------DeepSpeed C++/CUDA extension op report --------------------------------------------------- - - -----------------------------------------------------------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.DeepSpeed C++/CUDA extension op report - - - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.----------------------------------------------------------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - - - -JIT compiled ops requires ninjaNOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.-------------------------------------------------- - --------------------------------------------------- --------------------------------------------------- -JIT compiled ops requires ninja -JIT compiled ops requires ninja -JIT compiled ops requires ninja - -ninjaninjaninjaninja ........................................................................ [OKAY][OKAY][OKAY][OKAY] - - - --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- - - - -op nameop nameop nameop name ................................................................ installedinstalled installed installed .... .. 
compatible..compatible - --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -compatible--------------------------------------------------compatible-------------------------------------------------- - - - ----------------------------------------------------------------------------------------------------- - ----------------------------------------------------------------------------------------------------- - -DeepSpeed C++/CUDA extension op reportDeepSpeed C++/CUDA extension op report - ----------------------------------------------------------------------------------------------------- --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. ---------------------------------------------------DeepSpeed C++/CUDA extension op report --------------------------------------------------- - - -----------------------------------------------------------------------------------------------------DeepSpeed C++/CUDA extension op reportJIT compiled ops requires ninja - - -JIT compiled ops requires ninja ---------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - - ----------------------------------------------------------------------------------------------------- -DeepSpeed C++/CUDA extension op report - -DeepSpeed C++/CUDA extension op report-------------------------------------------------- - ---------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.-------------------------------------------------- - - ---------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.DeepSpeed C++/CUDA extension op report - - -JIT compiled ops requires ninja---------------------------------------------------------------------------------------------------- - - -NOTE: Ops not installed will be just-in-time (JIT) compiled at -cpu_adamcpu_adam cpu_adam..............................cpu_adam [YES] ............... [YES]..................... [YES]......[OKAY][YES] -[OKAY]............ -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. ----------------------------------------------------------------------------------------------------- - -JIT compiled ops requires ninjaJIT compiled ops requires ninja - - runtime if needed. 
Op compatibility means that your system - meet the required dependencies to JIT install the op.JIT compiled ops requires ninja - --------------------------------------------------- - [OKAY][OKAY] - -JIT compiled ops requires ninja -fused_adam ............. [NO]fused_adam .......fused_adamfused_adam............. [OKAY][NO].......................... - .......fused_lamb [NO] [NO][OKAY]............. -..............[NO] fused_lamb.......[OKAY][OKAY] - -............. [OKAY][NO] -fused_lambfused_lamb ................................. [OKAY][NO][NO] - .............. [OKAY][OKAY]sparse_attn - - ............ [NO] ....... [OKAY]sparse_attn - ............ [NO] .......transformersparse_attn sparse_attn............ [OKAY] ............ -[NO]............ [NO].......[NO]transformer .......................... [OKAY] [OKAY][NO] -[OKAY] - -....... [OKAY]stochastic_transformer - transformertransformer . stochastic_transformer ........................ [NO] .[NO][NO]....... .......[NO][OKAY]....... [OKAY] -....... -[OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -[OKAY]stochastic_transformer -async_io ............... [NO] ....... [NO] - stochastic_transformer. [NO]. .......[NO] [OKAY]....... - [OKAY] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- ----------------------------------------------------------------------------------------------------- - -DeepSpeed C++/CUDA extension op report -DeepSpeed C++/CUDA extension op report-------------------------------------------------- - ---------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.-------------------------------------------------- --------------------------------------------------- --------------------------------------------------- - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. -DeepSpeed C++/CUDA extension op reportDeepSpeed C++/CUDA extension op report -JIT compiled ops requires ninja - --------------------------------------------------- --------------------------------------------------- --------------------------------------------------- -JIT compiled ops requires ninja -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - --------------------------------------------------- ---------------------------------------------------JIT compiled ops requires ninja - -JIT compiled ops requires ninja -ninjaninjaninjaninja ...................................................... .................. [OKAY] [OKAY][OKAY] -[OKAY] - --------------------------------------------------- --------------------------------------------------- --------------------------------------------------- -op name ---------------------------------------------------op name op name - ................ 
................................installedop name installed..installed................ compatibleinstalled.... - ..--------------------------------------------------compatiblecompatible - - -compatible-------------------------------------------------- - ----------------------------------------------------------------------------------------------------- - -cpu_adam ...............cpu_adam cpu_adam[YES]cpu_adam............... ......[YES].............................. ......[YES][YES] [OKAY]...... -...... [OKAY][OKAY] - - [OKAY] -fused_adam ............. [NO] fused_adamfused_adam....... .............[OKAY]............. - [NO][NO] .......fused_lamb ....... [OKAY] ............. -[OKAY] -[NO]fused_adam .......fused_lamb............. fused_lamb [OKAY] -.......................... [NO][NO][NO] .............. [OKAY][OKAY]....... - -[OKAY]sparse_attn - ............ [NO] ....... [OKAY] -fused_lambtransformer ............sparse_attnsparse_attn ..................................... [NO][NO][NO][NO] ..................... [OKAY] [OKAY][OKAY] - -transformer....... -transformer ........................ stochastic_transformer[NO] [NO][OKAY] . - ....... ....... [NO] [OKAY][OKAY]....... - - [OKAY] -stochastic_transformerstochastic_transformer .. sparse_attn [NO] [NO] .............. [OKAY]............[OKAY] - -[NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report-------------------------------------------------- - --------------------------------------------------- -DeepSpeed C++/CUDA extension op report-------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. -JIT compiled ops requires ninja --------------------------------------------------- - --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.JIT compiled ops requires ninja - --------------------------------------------------- -JIT compiled ops requires ninja -ninjaninjaninjaninja ........................................................................ [OKAY][OKAY][OKAY][OKAY] - - - ------------------------------------------------------------------------------------------------------------------------------------------------------- --------------------------------------------------- - - -op nameop name op name op name................ ................ ................installed ................ installedinstalled ..installed .. .. compatible.. 
compatiblecompatible - - -----------------------------------------------------------------------------------------------------compatible --------------------------------------------------- - - --------------------------------------------------- -cpu_adam cpu_adam...............cpu_adam ...............cpu_adam...............[YES] [YES] [YES]........................... [OKAY]......[OKAY] [YES] - [OKAY] -...... - [OKAY] -fused_adam fused_adam............. fused_adam ............. [NO]fused_adam............. .......[NO][NO]............. .......[OKAY].......[NO] - [OKAY][OKAY] -fused_lamb....... - .............[OKAY]fused_lamb -[NO]fused_lamb............. fused_lamb....... [NO].............[OKAY]............. - .......[NO][NO] [OKAY].............. - [OKAY][OKAY] - -sparse_attn ............ [NO] ....... [OKAY]sparse_attn -sparse_attn sparse_attn........................transformer [NO]............[NO]............ ....... [NO] .......[NO][OKAY] [OKAY] -....... -....... [OKAY][OKAY]transformer - -transformer............ transformerstochastic_transformer[NO]............ ....................[NO] [OKAY] [NO][NO] -....... ..............[OKAY] -[OKAY]stochastic_transformer[OKAY] - -.stochastic_transformer [NO] stochastic_transformer....... . [OKAY] . -[NO] [NO]....... .......[OKAY] -[OKAY] -ninjaninjaninjaninja ........................................................................ [OKAY][OKAY][OKAY][OKAY] - - - ------------------------------------------------------------------------------------------------------------------------------------------------------- - --------------------------------------------------- -op name -op name op name................ op name................ ................installedinstalled................ installed .... installedcompatiblecompatible.. - -ninjaninjaninjaninja .................. .................................... .................. [OKAY] [OKAY] -[OKAY][OKAY] - - - --------------------------------------------------..-------------------------------------------------- compatiblecompatible - - - ----------------------------------------------------------------------------------------------------- - ----------------------------------------------------------------------------------------------------- ----------------------------------------------------------------------------------------------------- -op name - -cpu_adamcpu_adam ...............cpu_adam............... [YES]cpu_adam...............[YES] ...... ..................... [YES] [OKAY] [OKAY] -[YES]...... - ......[OKAY] -[OKAY] - op nameop name................op name ................................................ installed installed installed installed.... compatible....compatible - - compatible----------------------------------------------------------------------------------------------------compatible - - - --------------------------------------------------- --------------------------------------------------- -fused_adam ............. fused_adam[NO] fused_adam.................... fused_adam [NO][OKAY] -cpu_adamcpu_adam ...............cpu_adam...............cpu_adam [YES][YES].............................. ...... [YES] ......[YES][OKAY] -.................... .............[NO]fused_lamb[OKAY] -......[OKAY]...... - [OKAY][OKAY] - -[NO] ............. ....... .......fused_lamb[NO] [OKAY][OKAY] ............. - - .......[NO] [OKAY]....... -fused_adam ............. [NO]fused_adam .................... fused_adam [OKAY]fused_adam -fused_lambfused_lamb [OKAY].......................... - [NO][NO] .............. 
[OKAY][OKAY] - - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - [NO] ............. .............fused_lamb ....... [NO] [NO]............. [OKAY] [NO]....... -sparse_attn ............ [NO] sparse_attn....... ............[OKAY] -async_io ............... [NO] ....... [NO] - ....... ....... fused_lamb[OKAY][OKAY][OKAY] ............. - - -[NO] sparse_attn.......sparse_attntransformer ............[OKAY]........................ -transformer_inference .. [NO] ....... [OKAY] - [NO]fused_lamb ....... fused_lamb ............. [OKAY] ............. - [NO] [NO][NO] transformer ................................. [OKAY][OKAY][OKAY][NO] - - -[NO] [NO]....... sparse_attn....... ............[OKAY][OKAY] - -....... stochastic_transformertransformertransformer [OKAY]............ . -utils .................. [YES] ...... [OKAY] -[NO]sparse_attn ................... [OKAY][NO] - ....... [OKAY] - ............[NO][NO] [NO]stochastic_transformer.............. .......[OKAY].[OKAY] - -[OKAY][NO] -quantizer .............. [NO] ....... [OKAY] -transformer ............transformer [NO]sparse_attn sparse_attn................... [NO]............[OKAY] ............ - ....... stochastic_transformer[OKAY]stochastic_transformer - . .[NO] [NO]....... .......[OKAY] --------------------------------------------------- - ....... [NO] [NO] stochastic_transformer [OKAY]....... ....... - . [OKAY] [OKAY] -[NO]stochastic_transformer -[OKAY] - .......transformer .[OKAY] -transformer[NO]............ ...................[NO] [OKAY][NO]....... - .......[OKAY] -[OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -stochastic_transformerstochastic_transformer .. [NO][NO] .............. [OKAY][OKAY] - -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -ninjaninjaninjaninja ...................................................... .................. [OKAY] - [OKAY][OKAY][OKAY]-------------------------------------------------- - - - ---------------------------------------------------op name -----------------------------------------------------------------------------------------------------................op name - - op name................installedop name installed.................. ................ ..installed installedcompatible compatible -.. ---------------------------------------------------.. -------------------------------------------------- -compatible -compatible - ----------------------------------------------------------------------------------------------------- - -cpu_adam ............... cpu_adam[YES] cpu_adamcpu_adam ............... .....................[YES]............... [OKAY] ......[YES] -[YES] [OKAY]............ - [OKAY][OKAY] - -fused_adam ............. [NO] ....... fused_adam[OKAY] -fused_adam.............fused_adamfused_lamb [NO]....................................... .......[NO][NO][NO] [OKAY]....... - ..............[OKAY] [OKAY] -[OKAY]fused_lamb - - ............. [NO]fused_lambfused_lamb ....... ............. .............[OKAY][NO]sparse_attn - [NO]................... .......[NO][OKAY] .......[OKAY] - -[OKAY] -sparse_attn ............ transformer[NO] ................... [NO][OKAY] -sparse_attn.......sparse_attn ............[OKAY]transformer............ 
- ............[NO][NO] .......stochastic_transformer[NO] .......[OKAY]....... - [OKAY].[OKAY] - -transformer[NO] ....... stochastic_transformertransformer............[OKAY] -.............[NO] [NO][NO]....... ....... ....... [OKAY] [OKAY] -[OKAY] - -stochastic_transformer stochastic_transformer . .[NO] [NO]....... .......[OKAY] -[OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_io async_io............... ...............[NO] [NO]....... .......[NO] -[NO] -transformer_inference transformer_inference.. ..[NO] [NO]....... .......[OKAY] -[OKAY] -utils utils.................. ..................[YES] [YES]...... ......[OKAY] -[OKAY] -quantizerquantizer ............................ [NO][NO] .............. [OKAY][OKAY] - ----------------------------------------------------------------------------------------------------- - - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_io async_io............... [NO]............... .......[NO] [NO]....... - [NO] -transformer_inference transformer_inference.. ..[NO] [NO]....... .......[OKAY] -[OKAY] -utilsutils .................................... [YES][YES] ............ [OKAY][OKAY] - -quantizer quantizer.............. ..............[NO] [NO]....... .......[OKAY] -[OKAY] --------------------------------------------------- --------------------------------------------------- ----------------------------------------------------------------------------------------------------- -DeepSpeed C++/CUDA extension op report - -DeepSpeed C++/CUDA extension op report-------------------------------------------------- - ---------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - -----------------------------------------------------------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - - -JIT compiled ops requires ninja--------------------------------------------------DeepSpeed C++/CUDA extension op report - - -JIT compiled ops requires ninja --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.-------------------------------------------------- - --------------------------------------------------- -DeepSpeed C++/CUDA extension op reportJIT compiled ops requires ninja - --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. 
--------------------------------------------------- -JIT compiled ops requires ninja ----------------------------------------------------------------------------------------------------- - -DeepSpeed C++/CUDA extension op reportDeepSpeed C++/CUDA extension op report - ------------------------------------------------------------------------------------------------------------------------------------------------------- - - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system --------------------------------------------------- -DeepSpeed C++/CUDA extension op report-------------------------------------------------- - ----------------------------------------------------------------------------------------------------- - meet the required dependencies to JIT install the op.DeepSpeed C++/CUDA extension op report - - --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- - - - -JIT compiled ops requires ninjaDeepSpeed C++/CUDA extension op report - -JIT compiled ops requires ninjaNOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- - ---------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - -JIT compiled ops requires ninja-------------------------------------------------- - -JIT compiled ops requires ninja -DeepSpeed C++/CUDA extension op report-------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - - -DeepSpeed C++/CUDA extension op report--------------------------------------------------DeepSpeed C++/CUDA extension op report - --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.-------------------------------------------------- --------------------------------------------------- - -JIT compiled ops requires ninja --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- - meet the required dependencies to JIT install the op.NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. 
- - -JIT compiled ops requires ninja---------------------------------------------------------------------------------------------------- - - -JIT compiled ops requires ninjaJIT compiled ops requires ninja - -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. ----------------------------------------------------------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report -JIT compiled ops requires ninja-------------------------------------------------- - ---------------------------------------------------DeepSpeed C++/CUDA extension op report - - -DeepSpeed C++/CUDA extension op reportNOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- --------------------------------------------------- --------------------------------------------------- - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.JIT compiled ops requires ninja - - ----------------------------------------------------------------------------------------------------- - -JIT compiled ops requires ninjaJIT compiled ops requires ninja - --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja ----------------------------------------------------------------------------------------------------- - -DeepSpeed C++/CUDA extension op reportDeepSpeed C++/CUDA extension op report - ----------------------------------------------------------------------------------------------------- - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. 
- ----------------------------------------------------------------------------------------------------- - -JIT compiled ops requires ninjaJIT compiled ops requires ninja - --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -ninjaninjaninjaninja ........................................................................ [OKAY] [OKAY][OKAY] -[OKAY] - --------------------------------------------------- - ------------------------------------------------------------------------------------------------------------------------------------------------------- -op name - --------------------------------------------------- - op name................op name op name................installed ................................ installed ..installedinstalled ....compatible.. -compatible compatible ---------------------------------------------------compatible - - ----------------------------------------------------------------------------------------------------- - --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. -cpu_adam ............... cpu_adam[YES]cpu_adam cpu_adam..................... ...............[YES]............... [OKAY] ...... -[YES] [YES] [OKAY] ...... -...... [OKAY][OKAY] ----------------------------------------------------------------------------------------------------- - -JIT compiled ops requires ninjaDeepSpeed C++/CUDA extension op report - --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- - -fused_adam ............. [NO] fused_adam....... .............[OKAY] -JIT compiled ops requires ninja -fused_adam[NO]fused_adam fused_lamb................................. .............[NO][NO][OKAY] -[NO].............. .......fused_lamb [OKAY][OKAY][OKAY] -............. - - [NO]fused_lamb fused_lamb ....... ............. ............. [OKAY] [NO] -[NO] ..............sparse_attn [OKAY][OKAY]............ - - [NO] ....... [OKAY]sparse_attn - ............ transformer[NO] ................... [NO][OKAY] -....... sparse_attnsparse_attn[OKAY]transformer -............ ............ ............ stochastic_transformer[NO][NO] [NO]........ [OKAY] -....... [NO] ....... [OKAY] .......transformer[OKAY] - -[OKAY]............ -transformer [NO]stochastic_transformer............ .......[NO] . [OKAY] ....... -[NO] [OKAY].......stochastic_transformer - [OKAY] -. stochastic_transformer[NO] ....... .[OKAY] -[NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... 
[OKAY] --------------------------------------------------- -ninjaninjaninjaninja ........................................................................ [OKAY][OKAY][OKAY][OKAY] - - --------------------------------------------------- ----------------------------------------------------------------------------------------------------- - - ---------------------------------------------------op nameop name - op name................................op name installed ................installed .................. installed..compatible - ..installed--------------------------------------------------compatible - -..compatible-------------------------------------------------- -compatible - ----------------------------------------------------------------------------------------------------- - -cpu_adam ............... [YES] ......cpu_adam cpu_adamcpu_adam ............................................. [YES][YES][YES] ............[OKAY]...... [OKAY][OKAY] - -[OKAY] - -fused_adamfused_adam .......................... fused_adam [NO] [NO].................... fused_adam[NO] ....... [OKAY].................... - [OKAY][OKAY] - -[NO]fused_lamb fused_lambfused_lamb....... ............. .......................... [OKAY] -[NO][NO][NO] fused_lamb ..................... [OKAY] -[OKAY][OKAY]............. - - [NO] ....... [OKAY] -sparse_attn sparse_attn............sparse_attn ............[NO]............ [NO][NO]....... ....... .......[OKAY] sparse_attn -[OKAY][OKAY] - - transformertransformer............ transformer ........................ [NO] ............[NO] [NO].......[NO] .............. [OKAY] [OKAY] -.......[OKAY] - -[OKAY]stochastic_transformer - transformerstochastic_transformer . stochastic_transformer............[NO]. .......[NO] .[NO]....... [NO][OKAY] ....... -....... [OKAY][OKAY][OKAY] - - -stochastic_transformer . [NO] ....... [OKAY] -ninjaninjaninjaninja .................................... .................. [OKAY].................. [OKAY] - [OKAY] -[OKAY] --------------------------------------------------- - ------------------------------------------------------------------------------------------------------------------------------------------------------- -op name - -ninjaninjaninjaninja ........................................................................ [OKAY][OKAY][OKAY][OKAY] - - op name................op nameop name installed................................ ................ installed..installed compatible..installed.. - compatible--------------------------------------------------..compatible - - -compatible---------------------------------------------------------------------------------------------------- - - --------------------------------------------------- - - ----------------------------------------------------------------------------------------------------- - -----------------------------------------------------------------------------------------------------op name -op name -ninjaninja .................................... [OKAY][OKAY] - -cpu_adam ............... [YES] ......cpu_adam cpu_adam cpu_adam[OKAY].............................. ............... -ninjaninjaninjaninja ........................................................................ [OKAY][OKAY] -[OKAY][OKAY] --------------------------------------------------- - - - ................op nameop name................ installedinstalled................................ ..installed..installed compatible .. -compatible.. 
- -------------------------------------------------- --------------------------------------------------compatible -compatible - - ----------------------------------------------------------------------------------------------------- - ----------------------------------------------------------------------------------------------------- - - [YES] [YES] [YES] ...... ............[OKAY] -[OKAY][OKAY] -fused_adam -------------------------------------------------------------------------------------------------------------------------------------------------------op name - - -cpu_adam cpu_adam............... cpu_adam...............[YES]cpu_adam [YES]...... [OKAY]..................... -............... [YES][OKAY][YES] - ............ [OKAY][OKAY] - -op nameop name ................................ installedinstalled .. ..compatible - ............. [NO] ....... [OKAY] -................op nameop nameop name ................installed................................ installed..installedinstalled .... compatible.. compatible - compatible -compatible-------------------------------------------------- - - ------------------------------------------------------------------------------------------------------------------------------------------------------- - - -fused_adam ............. [NO] fused_adam....... .............[OKAY] -compatible-------------------------------------------------- - -fused_adam fused_adam.............fused_lambfused_adam ............. .............[NO] ............. [NO] .......[NO]....... [NO] [OKAY][OKAY].............. - - [OKAY][OKAY] - -cpu_adam cpu_adamcpu_adamcpu_adam............... .............................................[YES] [YES][YES]...... [YES] ...... ......[OKAY] -......[OKAY][OKAY] - -[OKAY] -[NO]fused_adamfused_adam ....... fused_lamb .............[OKAY]............. --------------------------------------------------- -fused_lamb .............fused_lamb fused_lamb [NO] ............. .................... sparse_attn[NO][NO] [OKAY] -.......................... [NO][OKAY][OKAY] - -....... [OKAY] -fused_adam ............. [NO] fused_adamfused_adam.......fused_adam .......................... [OKAY] .............[NO] - .............[NO][NO] [NO]fused_lamb.............. ....... .............[OKAY] -[OKAY][OKAY][NO] - - fused_lamb....... .............fused_lamb[OKAY] -cpu_adam ...............cpu_adam [YES] ..................... [YES][OKAY] -transformer ............ [NO]sparse_attn sparse_attn...................sparse_attn [NO][OKAY]........................ -[NO] [NO]....... .......fused_lamb....... [OKAY][OKAY]............. -[OKAY] - -[NO]fused_lamb .......fused_lamb............. fused_lamb [OKAY]............. - [NO]............. .......[NO] [OKAY]....... - sparse_attn[OKAY] ............ - ...... [OKAY] - .......[NO][NO] stochastic_transformer [OKAY] ....... - [NO] ............. [NO] .......[NO] .......[OKAY]....... - [OKAY][OKAY] -sparse_attn - [NO] ....... sparse_attn[OKAY] -fused_adam fused_adam............. [NO]............. .......[NO] [OKAY] -....... . [OKAY][OKAY][NO]transformer - - ............ [NO] ....... [OKAY] -............ [NO]transformer .......sparse_attn............ [OKAY] - ....... [OKAY] - ...................transformer transformer[NO][OKAY] ............ -ninjaninjaninjaninja .................. .................................... .................. 
[OKAY] [OKAY][OKAY] -[OKAY] - - ------------------------------------------------------------------------------------------------------------------------------------------------------- --------------------------------------------------- - -op name -sparse_attntransformer ............sparse_attn............sparse_attn [NO] [NO]............ ................... ....... [OKAY] [NO][NO][OKAY] - -sparse_attn[NO]............ transformer.......[NO] ............[OKAY] - ............ ....... [NO][NO][OKAY] stochastic_transformer -fused_lamb ............. fused_lamb[NO] .................... [OKAY] -................... [NO][NO][OKAY] - .............. [OKAY] -[OKAY]stochastic_transformer -op name op name op name ................................................ installedinstalled................ installed .. .. installed ..compatible compatible -.. ---------------------------------------------------compatible - ---------------------------------------------------compatible - - .............. [OKAY][OKAY] -transformerstochastic_transformer -.............. transformer [OKAY] [OKAY] -............. -[NO] ....... [OKAY] - .stochastic_transformerstochastic_transformer [NO] ......... [NO][OKAY][NO] --------------------------------------------------- --------------------------------------------------- - ............ .[NO]transformer transformer [NO]....... ........................ ....... [NO][OKAY][NO] - [OKAY].............. - [NO]stochastic_transformer [NO]transformer ....... ....................[OKAY] [NO] -[NO] [OKAY] ....... -sparse_attn sparse_attn............ ............[NO] .......[NO] [OKAY]....... - [OKAY] - .............. [OKAY][OKAY] - -cpu_adam ............... [YES] cpu_adam...... cpu_adam ...............cpu_adam [OKAY] - stochastic_transformer [OKAY][OKAY] - -. [NO] .......stochastic_transformer stochastic_transformer[OKAY] -....... [OKAY][OKAY] -stochastic_transformer -transformer ............ [NO]transformer ....... ............[OKAY] -[NO] ....... [OKAY] -...............[YES]............... [YES]......[YES] [OKAY]............ - [OKAY][OKAY]fused_adam - -.. [NO][NO] .............. [OKAY][OKAY] - - . [NO]stochastic_transformer ....... .[OKAY] -[NO] ....... [OKAY] -stochastic_transformer stochastic_transformer. [NO] ........ [NO][OKAY] - ............. [NO] ....... [OKAY] - ....... [OKAY] -fused_adam .............fused_adam fused_adamfused_lamb [NO] ............. ................................. [NO] [NO][OKAY][NO] -.............. .......[OKAY]fused_lamb[OKAY] - -.............[OKAY] -[NO]fused_lamb .......fused_lamb............. [OKAY].............[NO] - [NO]sparse_attn....... ...................[OKAY] -[NO][OKAY] -....... [OKAY]sparse_attn - ............ [NO] transformer....... ............sparse_attn [OKAY][NO] - ...................sparse_attn transformer[OKAY][NO] ............ -............ ....... [NO] [NO]stochastic_transformer [OKAY] .............. - .[OKAY] [OKAY] -[NO]transformer - ................... stochastic_transformer transformer [OKAY][NO] -. ............ .......[NO] [OKAY].......[NO] - [OKAY]....... -stochastic_transformer [OKAY] -. [NO] .......stochastic_transformer [OKAY] -. [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io  [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`................ [NO] -....... [NO] -transformer_inference .. [NO] ....... [OKAY] -async_io ............... [NO] utils....... 
..................[NO] [YES] - ...... [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [OKAY] - -quantizer .............. transformer_inference[NO] ......... [NO][OKAY] -....... [OKAY] --------------------------------------------------- -async_io ............... [NO]utils ......................... [NO][YES] - ...... [OKAY] -quantizer .............. transformer_inference[NO] ......... [NO][OKAY] -....... [OKAY] --------------------------------------------------- -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. ----------------------------------------------------------------------------------------------------- --------------------------------------------------- -JIT compiled ops requires ninja -DeepSpeed C++/CUDA extension op report - -DeepSpeed C++/CUDA extension op report ----------------------------------------------------------------------------------------------------- - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.-------------------------------------------------- - -JIT compiled ops requires ninja-------------------------------------------------- - -JIT compiled ops requires ninja -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -ninja .................. [OKAY] ---------------------------------------------------cpu_adam - ...............op name [YES]................ installed...... .. [OKAY]compatible - --------------------------------------------------- -fused_adam .............cpu_adam [NO]............... .......[YES] ......[OKAY] -[OKAY] -fused_lamb ............. [NO] ....... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -sparse_attnfused_lamb ......................... [NO][NO] .............. [OKAY] -[OKAY] -transformer ............ [NO] ....... [OKAY] -sparse_attn ............ stochastic_transformer[NO] ....... .[OKAY] - [NO] .......transformer ............[OKAY] [NO] - ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -stochastic_transformer . [NO] ....... [OKAY] -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. 
---------------------------------------------------
-DeepSpeed C++/CUDA extension op report
---------------------------------------------------
-NOTE: Ops not installed will be just-in-time (JIT) compiled at
-      runtime if needed. Op compatibility means that your system
-      meet the required dependencies to JIT install the op.
---------------------------------------------------
-JIT compiled ops requires ninja
-ninja .................. [OKAY]
---------------------------------------------------
-op name ................ installed .. compatible
---------------------------------------------------
-cpu_adam ............... [YES] ...... [OKAY]
-fused_adam ............. [NO] ....... [OKAY]
-fused_lamb ............. [NO] ....... [OKAY]
-sparse_attn ............ [NO] ....... [OKAY]
-transformer ............ [NO] ....... [OKAY]
-stochastic_transformer . [NO] ....... [OKAY]
---------------------------------------------------
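For reference, the same op report can be regenerated outside of a training run. A minimal sketch, assuming the DeepSpeed install that produced this log (the ds_report CLI ships with DeepSpeed; the per-op query via op_builder is an assumption about the API at this version):

# Shell equivalent of the report block above:
#   ds_report
# Programmatic spot-check of a single op:
from deepspeed.ops.op_builder import FusedAdamBuilder

builder = FusedAdamBuilder()
# "compatible" in the table means this returns True; an op listed as
# installed=[NO] is JIT-compiled on first use (e.g. via builder.load()).
print(builder.is_compatible())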
.................................... ..................[OKAY][OKAY] [OKAY] -[OKAY] -.......[OKAY] -[OKAY] -stochastic_transformer stochastic_transformer .. [NO][NO] .............. [OKAY][OKAY] - -[OKAY] --------------------------------------------------- - - ----------------------------------------------------------------------------------------------------- ---------------------------------------------------op name - -op name op name................op name installed................................ ................ .. installedinstalledcompatibleinstalled -sparse_attn ............ [NO]sparse_attn ....... sparse_attn[OKAY]............ -ninjaninjaninjaninja .................. .................................... .................. [OKAY][OKAY] [OKAY] - --------------------------------------------------.. .. -..compatible compatible -compatible - ----------------------------------------------------------------------------------------------------- --------------------------------------------------- -cpu_adam - ............[NO] transformersparse_attn.......[NO] ............ ............ [OKAY] .......[NO][NO] - .............. [OKAY]transformer [OKAY][OKAY] - -[OKAY] --------------------------------------------------- --------------------------------------------------- ----------------------------------------------------------------------------------------------------- -op name - ............... [YES] ...... [OKAY]cpu_adam - -............ - - ...............cpu_adamcpu_adam [YES] ............... ............... ...... [YES][YES]fused_adam [OKAY] ................... - ......[NO][OKAY] -[OKAY]....... - transformerstochastic_transformer[NO]transformer ................................ [OKAY][NO][NO][NO] ....... - op nameop name................ op name ................ ................installed ................ installedinstalled .. installed.. .. compatible -compatible..compatible-------------------------------------------------- - - ----------------------------------------------------------------------------------------------------- -compatible - - [OKAY] -.............. [OKAY][OKAY][OKAY] - -stochastic_transformer --------------------------------------------------- -fused_adam .............fused_lamb [NO]............. fused_adam.......fused_adam [NO].............[OKAY] - .stochastic_transformer [NO] .stochastic_transformer....... [NO][OKAY] ........ -cpu_adam cpu_adam............... cpu_adam............... [YES]cpu_adam...............[YES] [YES] ...... ..................... ...... [OKAY] [YES][OKAY] - ....................[NO] fused_lamb [NO] [OKAY]............. ....... -.......[NO] [OKAY].......[OKAY] - - [OKAY] -[NO] ....... [OKAY] -[OKAY] - -...... [OKAY] -[OKAY] -fused_adam fused_adam.............fused_adam [NO]fused_adam............. ............. ....... ............. [NO][NO] [OKAY]....... [NO] -fused_lambfused_lamb .......................... sparse_attn [NO] [NO]............ ..............[NO] [OKAY]sparse_attn[OKAY]....... - - ....... [OKAY] .......[OKAY] - fused_lamb -[OKAY] - ............ [OKAY][NO] - ....... [OKAY] -.............fused_lamb fused_lamb.............[NO] fused_lamb ............. ....................[NO][NO] [NO] [OKAY] ..................... - [OKAY][OKAY][OKAY] - - -transformer ............ transformer[NO] ................... sparse_attn [NO][OKAY] .......sparse_attn - ............ ............ [OKAY] -sparse_attn ............ [NO]sparse_attnsparse_attn sparse_attn ........................................... 
[NO][OKAY][NO][NO] -ninjaninjaninjaninja ........................................................................ [OKAY][OKAY][OKAY][OKAY] - - - -[NO]stochastic_transformer[NO] stochastic_transformer ....... ........ [OKAY][NO]. - .......[OKAY][NO] -[OKAY] transformer....... - ..................... transformer [OKAY][OKAY] - -[OKAY]............ - transformer[OKAY]............ - transformer[NO] transformertransformer................... ............[NO] ............[NO][OKAY]....... -....... [NO] [OKAY] [OKAY] ----------------------------------------------------------------------------------------------------- ----------------------------------------------------------------------------------------------------- - -op name - op nameop name................op name ................ ................................ installed installedinstalled installed ...... compatiblecompatible..compatible - - -----------------------------------------------------------------------------------------------------compatible-------------------------------------------------- - - - - ............[NO] [NO]....... .......[OKAY] -[OKAY] -stochastic_transformer....... - [OKAY]stochastic_transformer --------------------------------------------------- -stochastic_transformer stochastic_transformer. [NO]. .......[NO] [OKAY]....... -.stochastic_transformer [NO].stochastic_transformer . .......[NO] .[NO] ....... [OKAY] [NO]....... -[OKAY] -.......[OKAY] -cpu_adamcpu_adam cpu_adam ............... cpu_adam.............................. [YES][YES]...............[YES] ............[YES] [OKAY] ......[OKAY] - ...... -[OKAY] -[OKAY] - [OKAY] -[OKAY] -fused_adamfused_adam .......................... fused_adamfused_adam [NO][NO] ............. ....... ....... [NO]............. [OKAY] -[OKAY][NO]....... [OKAY] -fused_lamb - .................... fused_lambfused_lamb[OKAY] [NO] - ............. .............[NO]fused_lamb....... [NO][OKAY]....... - ............. [OKAY] ....... -[NO] [OKAY]....... - [OKAY]sparse_attn - ............ sparse_attn[NO] ................... [OKAY][NO] - sparse_attn....... transformer............[OKAY] -............[NO] sparse_attn [NO].......transformer ....... [OKAY] ........................ [NO][OKAY] - -[NO]transformer ..............stochastic_transformer ............ [OKAY][NO] -.[OKAY] -.......[NO]stochastic_transformer transformer[OKAY] ....... - .............[OKAY] -stochastic_transformer[NO] .......[NO] . [OKAY] .......[NO] - ....... [OKAY] -[OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninjaninjaninjaninja ........................................................................ [OKAY][OKAY][OKAY][OKAY] - - - ----------------------------------------------------------------------------------------------------- ----------------------------------------------------------------------------------------------------- - -op name -op name op name op name ................ ................................ ................ installedinstalledinstalled installed .. .. .... compatiblecompatible - -compatiblecompatible---------------------------------------------------------------------------------------------------- - - - ----------------------------------------------------------------------------------------------------- - -cpu_adam ............... [YES]cpu_adam cpu_adam ......cpu_adam............... [YES]..............................[OKAY] - ......[YES][YES] [OKAY] -............ [OKAY][OKAY] - -fused_adamfused_adam ............. [NO]fused_adamfused_adam............. ....... ............. 
.............[OKAY] [NO][NO] - .......[NO]....... fused_lamb[OKAY]....... -[OKAY] -[OKAY] fused_lamb -............. fused_lamb .............[NO]fused_lamb .................................[NO] [OKAY]....... -[NO][NO] [OKAY].............. - [OKAY] -[OKAY] -sparse_attn ............ [NO] .......sparse_attn [OKAY] -sparse_attn............ transformer[NO] ............sparse_attn............ [NO][NO]....... ............ .............. [OKAY] [NO] -[OKAY] [OKAY] -transformer....... - stochastic_transformer[OKAY]............ - transformer[NO]. transformer...................[NO] ............[NO] [OKAY]....... -[NO].......[OKAY] -stochastic_transformer[OKAY] - ....... [OKAY]. -stochastic_transformer [NO] ........ [OKAY][NO] - .......stochastic_transformer [OKAY] -. [NO] ....... [OKAY] -ninjaninjaninjaninja ........................................................................ [OKAY][OKAY][OKAY][OKAY] - - - ----------------------------------------------------------------------------------------------------- - ----------------------------------------------------------------------------------------------------- -op nameop name -op name op name................................................ installed................installed installed .. installed..compatible .. - ..compatible -------------------------------------------------- - compatible ---------------------------------------------------compatible - - ----------------------------------------------------------------------------------------------------- - -cpu_adam ............... cpu_adam[YES] cpu_adamcpu_adam..................... ............... ...............[OKAY][YES][YES] - ......[YES]...... [OKAY]......[OKAY] - -[OKAY] -fused_adam ............. [NO] .......fused_adam [OKAY]fused_adam............. -fused_adam .............[NO]............. fused_lamb....... [NO] [NO] .............[OKAY]....... - [NO].......[OKAY] fused_lamb....... - .............[OKAY][OKAY] -fused_lamb - [NO]fused_lamb ............. ....... ............. [NO] [OKAY] [NO] -....... .......[OKAY] -[OKAY]sparse_attn -............ [NO] ....... [OKAY]sparse_attn - sparse_attn............transformer [NO]........................sparse_attn ....... [NO] [NO]....... ............ [OKAY][OKAY][NO] - -....... .......transformer[OKAY] stochastic_transformer -[OKAY]............ -transformer .[NO] ............transformer[NO]....... .......[NO] [OKAY]............[OKAY] ....... - - [NO][OKAY]stochastic_transformer - ....... .stochastic_transformer [OKAY] [NO] - ........ [NO]stochastic_transformer[OKAY] -....... .[OKAY] -[NO] ....... [OKAY] -ninjaninjaninjaninja ........................................................................ [OKAY][OKAY][OKAY] - - -ninjaninjaninjaninja .................. .................................... .................. [OKAY][OKAY] -[OKAY][OKAY] -----------------------------------------------------------------------------------------------------[OKAY]-------------------------------------------------- - - - -op nameop name -------------------------------------------------- op name................ - --------------------------------------------------- --------------------------------------------------- ---------------------------------------------------op name --------------------------------------------------- -................op name - ................ ................ op nameinstalledinstalledinstalled ...................... 
compatiblecompatiblecompatible - -installed ------------------------------------------------------------------------------------------------------------------------------------------------------- - - -.. compatible --------------------------------------------------- - op name installedop name ................ ................ ................ ..installedinstalledinstalled compatible ...... - --------------------------------------------------compatiblecompatible -compatible - - ------------------------------------------------------------------------------------------------------------------------------------------------------- - - -cpu_adamcpu_adamcpu_adam ............................................. cpu_adam[YES] [YES] [YES] ...... ...... ..................... [OKAY] [OKAY] -[OKAY] -[YES] - ...... [OKAY] -cpu_adam ............... cpu_adam[YES]cpu_adam cpu_adam ...... ..............................[OKAY]............... -[YES] [YES][YES] .................. [OKAY][OKAY][OKAY] - - -fused_adam fused_adam............. fused_adam.............[NO] fused_adam .............[NO]....... [OKAY][NO].................... -fused_adam ............. [NO] ....... [OKAY] - .......[OKAY] [NO] -[OKAY]fused_lamb -fused_adamfused_adamfused_lamb fused_adam ............. ............. .......................... [NO][NO] ....... [NO] [NO].......[OKAY] - .......fused_lamb.............fused_lamb .............[NO] .............[OKAY] [NO] -[NO] ....... ....... ....... [OKAY] [OKAY] -[OKAY] -fused_lamb - .......[OKAY]....... -fused_lamb[OKAY] -[OKAY]............. - ............. [NO] ....... [OKAY] - [NO]fused_lamb ....... fused_lamb............. [OKAY] ............. -[NO] sparse_attn[NO] .......................... [OKAY][NO][OKAY] -sparse_attn ............sparse_attn [NO]............sparse_attn .......[NO]............ [OKAY].......[NO] - -.......sparse_attn [OKAY]............ - [OKAY]....... - sparse_attn[OKAY] transformer - [NO] ....... transformer[OKAY] -............sparse_attn [NO] transformer sparse_attn ................... ............[OKAY] ............ [NO] -transformer............ ............ ............transformer [NO] [NO] [NO]............ .............. ....... [NO] [OKAY][OKAY] [OKAY] - [NO] [NO] ....... stochastic_transformer....... ....... [OKAY] [OKAY][OKAY] -. - - -....... - transformer[OKAY] - [NO] .......transformer stochastic_transformertransformer [OKAY]............ - stochastic_transformerstochastic_transformer............ stochastic_transformer.. [NO] [NO][NO]. .............. .......[NO][OKAY] - [OKAY][OKAY]....... - - .............[NO] [NO][NO]....... ..............[OKAY] -[OKAY][OKAY] - - [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -stochastic_transformer stochastic_transformer . [NO]. .......[NO] [OKAY]....... - [OKAY] -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... 
torch 1.8, cuda 11.1 ----------------------------------------------------------------------------------------------------- ---------------------------------------------------DeepSpeed C++/CUDA extension op report - - -DeepSpeed C++/CUDA extension op report-------------------------------------------------- -DeepSpeed C++/CUDA extension op report - ---------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.-------------------------------------------------- --------------------------------------------------- - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. ---------------------------------------------------DeepSpeed C++/CUDA extension op report -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - --------------------------------------------------- --------------------------------------------------- -JIT compiled ops requires ninja-------------------------------------------------- -JIT compiled ops requires ninja - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - -JIT compiled ops requires ninja-------------------------------------------------- - -JIT compiled ops requires ninja -ninjaninjaninjaninja .................. .................. .................................... [OKAY][OKAY] [OKAY] - - -[OKAY]-------------------------------------------------- - -------------------------------------------------------------------------------------------------------------------------------------------------------op name - - - op name................op nameop name installed................................................ ..installedinstalled installed compatible ...... - -------------------------------------------------- compatiblecompatible - -compatible - ----------------------------------------------------------------------------------------------------- - --------------------------------------------------- -cpu_adam ............... cpu_adam[YES]cpu_adam cpu_adam.............................. ...... ............... [YES][YES] [OKAY] [YES] -............ ......[OKAY][OKAY] - -[OKAY] -fused_adam ............. [NO]fused_adam fused_adamfused_adam....... ..........................[OKAY] ............. - [NO][NO][NO] ..............fused_lamb ....... [OKAY][OKAY] ............. - -[OKAY][NO] - fused_lambfused_lamb....... ..........................fused_lamb[OKAY] -.............[NO][NO] [NO].............. .......[OKAY][OKAY] - -[OKAY] -sparse_attn ............ [NO] .......sparse_attnsparse_attnsparse_attn ............[OKAY]............ ............ -[NO] [NO][NO]....... .......transformer....... [OKAY] [OKAY] -............[OKAY] - -[NO]transformer transformer .......transformer ........................ ............ [OKAY][NO] [NO] -[NO] ..................... [OKAY][OKAY][OKAY] - -stochastic_transformer - stochastic_transformerstochastic_transformer .stochastic_transformer . .[NO] . [NO][NO] ....... [NO] ....... .......[OKAY]....... - [OKAY][OKAY][OKAY] - - -DeepSpeed general environment info: -torch install path ............... 
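Every rank prints the op-compatibility table above at startup, which is why the raw log shows it interleaved many times over. The same check can be reproduced offline for a single op; the following is a minimal sketch, assuming DeepSpeed 0.4.x (the version recorded in this log) where the per-op builders live under `deepspeed.ops.op_builder`:

```python
# Sketch: reproduce the "compatible" check for one op and trigger its
# JIT build, as the report's NOTE describes. FusedAdamBuilder is one of
# the builders behind the "fused_adam" row in the table above.
from deepspeed.ops.op_builder import FusedAdamBuilder

builder = FusedAdamBuilder()
print("compatible:", builder.is_compatible())  # mirrors the [OKAY] column
fused_adam = builder.load()  # ninja JIT-compiles the op on first use, then caches it
```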
- [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`.
-async_io ............... [NO] ....... [NO]
-transformer_inference .. [NO] ....... [OKAY]
-utils .................. [YES] ...... [OKAY]
-quantizer .............. [NO] ....... [OKAY]
---------------------------------------------------
-DeepSpeed general environment info:
-torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']
-torch version .................... 1.8.1
-torch cuda version ............... 11.1
-nvcc version ..................... 11.2
-deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']
-deepspeed info ................... 0.4.2+bc17042, bc17042, big-science
-deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1
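The "general environment info" block above is emitted by DeepSpeed itself (the full report can also be regenerated with the `ds_report` command). The same fields can be read back directly in Python; a sketch, assuming it is run inside the same conda environment (the expected values come from this log: torch 1.8.1, CUDA 11.1, deepspeed 0.4.2+bc17042):

```python
# Sketch: read the fields DeepSpeed prints in its
# "general environment info" block.
import os
import torch
import deepspeed

print("torch install path ...", os.path.dirname(torch.__file__))
print("torch version ........", torch.__version__)   # 1.8.1 in this log
print("torch cuda version ...", torch.version.cuda)  # 11.1 in this log
print("deepspeed info .......", deepspeed.__version__)
```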
- -JIT compiled ops requires ninja-------------------------------------------------- - -JIT compiled ops requires ninja -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1DeepSpeed general environment info: -nvcc version -..................... 11.2 -deepspeed install path ...........torch install path ...............['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -deepspeed wheel compiled w. ......torch version torch 1.8, cuda 11.1.................... - 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -ninjaninjaninjaninja ........................................................................ [OKAY][OKAY][OKAY][OKAY] - - - --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- - - - -op nameop nameop nameop name ................................................................ installedinstalledinstalledinstalled .... .. ..compatiblecompatiblecompatible - - -compatible---------------------------------------------------------------------------------------------------- - --------------------------------------------------- --------------------------------------------------- - -cpu_adam cpu_adam...............cpu_adamcpu_adam ..............................[YES]............... [YES]......[YES][YES] ...... [OKAY]...... ...... - [OKAY] [OKAY] -[OKAY] - -fused_adam fused_adam............. fused_adam fused_adam.............[NO] ..........................[NO]....... [NO] ....... [NO][OKAY] ....... [OKAY] -....... - [OKAY][OKAY]fused_lamb - - fused_lamb............. fused_lambfused_lamb ............. [NO] .............[NO] ............. ....... [NO].......[NO] [OKAY] [OKAY].............. -[OKAY] - -[OKAY] -sparse_attnsparse_attnsparse_attnsparse_attn .................................... ............[NO] [NO] [NO] [NO].............. [OKAY][OKAY]....... - -DeepSpeed general environment info:DeepSpeed general environment info: - -....... [OKAY]transformer[OKAY] -transformer -torch install pathtorch install path .............................. ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - - ........................transformer [NO]transformer [NO] ............ ................... ....... [NO][NO][OKAY][OKAY] - -.............. [OKAY][OKAY] - -torch versiontorch version ........................................ 1.8.11.8.1 - -torch cuda version ............... 11.1 -stochastic_transformerstochastic_transformer stochastic_transformerstochastic_transformer .. . [NO]. [NO] [NO]..............[NO] [OKAY].............. -[OKAY] -nvcc version torch cuda version..................... ...............11.2 -[OKAY][OKAY] - -11.1deepspeed install path - nvcc version........... ..................... 
['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ......11.2 -torch 1.8, cuda 11.1 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -ninjaninjaninja ninja .................. .................. .................................... [OKAY] [OKAY][OKAY] -[OKAY] - - --------------------------------------------------- ----------------------------------------------------------------------------------------------------- --------------------------------------------------- -op name -op name op name op name................................ ................installedinstalled ................ ..installed..installed compatible..compatible.. - -compatible ---------------------------------------------------------------------------------------------------- -compatible - - --------------------------------------------------- --------------------------------------------------- -cpu_adamcpu_adam cpu_adam.............................. cpu_adam [YES]............... [YES] ............... [YES] ...... ...... [YES] [OKAY]...... [OKAY] - -......[OKAY] -[OKAY] -fused_adamfused_adamfused_adam ..........................fused_adam ............. [NO][NO] [NO]........................... [NO].......[OKAY][OKAY] - - [OKAY]....... -fused_lambfused_lamb .............fused_lamb[OKAY] .............[NO] - ............. [NO] .......fused_lamb [NO] [OKAY]....... ............. - .......[OKAY][NO] -[OKAY]....... - [OKAY] -sparse_attn ............ sparse_attnsparse_attn[NO] sparse_attn............................... ............ [NO][OKAY] -[NO][NO]....... ..............transformer[OKAY] - [OKAY][OKAY]............ - transformer -[NO]transformer ...............................transformer [OKAY] [NO][NO] -............ .............. [NO]stochastic_transformer[OKAY][OKAY] - -....... .[OKAY]stochastic_transformer -stochastic_transformer [NO] .........stochastic_transformer [NO] [OKAY] [NO] -....... ........ [OKAY] [OKAY] -[NO] - ....... [OKAY] ----------------------------------------------------------------------------------------------------- - -DeepSpeed C++/CUDA extension op reportDeepSpeed C++/CUDA extension op report - ----------------------------------------------------------------------------------------------------- - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - --------------------------------------------------- ---------------------------------------------------JIT compiled ops requires ninja - -JIT compiled ops requires ninja -ninjaninjaninjaninja .................................... .................. ..................[OKAY] [OKAY] -[OKAY] -[OKAY] --------------------------------------------------- --------------------------------------------------- - -----------------------------------------------------------------------------------------------------op name -op name - ................op name................ op name installedinstalled ................ ................ .. 
installed ..installed compatible .. -compatible.. --------------------------------------------------- compatible - -----------------------------------------------------------------------------------------------------compatible - - --------------------------------------------------- -cpu_adam ............... cpu_adam[YES] cpu_adam...... cpu_adam............... ............... [OKAY] [YES] -...............[YES] ............[YES] [OKAY][OKAY]...... - - fused_adam[OKAY] -............. [NO] ....... [OKAY] -fused_adam ............. fused_adamfused_lamb[NO]fused_adam ....................................... ....... [NO] [NO][OKAY] [NO] ....... - ....... ....... [OKAY]fused_lamb[OKAY] - -[OKAY]............. - fused_lamb[NO] fused_lamb ............. ....... ............. [NO][OKAY] sparse_attn[NO] -....... ...................[OKAY] -[OKAY][NO] - ....... [OKAY] -sparse_attn ............ transformer[NO] ................... [NO][OKAY] sparse_attn -....... sparse_attn ............ transformer[OKAY] ............ -[NO] ............[NO]....... stochastic_transformer[NO].......[OKAY] -.......[OKAY]. [OKAY] -[NO] - ....... transformertransformerstochastic_transformer [OKAY] ............ -............ .[NO][NO] [NO].............. .......[OKAY][OKAY] - -[OKAY] -stochastic_transformerstochastic_transformer .. [NO] [NO]....... .......[OKAY] - [OKAY] -ninjaninja .................................... [OKAY][OKAY] - ----------------------------------------------------------------------------------------------------- - -op nameop name ................................ installedinstalled .... compatiblecompatible - ----------------------------------------------------------------------------------------------------- - -cpu_adamcpu_adam .............................. [YES][YES] ............ [OKAY][OKAY] - -fused_adamfused_adam .......................... [NO] [NO]....... .......[OKAY] -[OKAY] -fused_lamb fused_lamb............. .............[NO] [NO]....... .......[OKAY] -[OKAY] -sparse_attn sparse_attn............ ............[NO] [NO]....... .......[OKAY] -[OKAY] -transformer transformer............ ............[NO] [NO]....... .......[OKAY] -[OKAY] -stochastic_transformer stochastic_transformer . [NO]. .......[NO] [OKAY]....... - [OKAY] -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 --------------------------------------------------- ---------------------------------------------------DeepSpeed C++/CUDA extension op report - -----------------------------------------------------------------------------------------------------DeepSpeed C++/CUDA extension op report - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.-------------------------------------------------- - -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 ---------------------------------------------------DeepSpeed C++/CUDA extension op report-------------------------------------------------- - - - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. 
Op compatibility means that your system - meet the required dependencies to JIT install the op.DeepSpeed C++/CUDA extension op reportJIT compiled ops requires ninja --------------------------------------------------- - --------------------------------------------------- - ---------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.JIT compiled ops requires ninja - - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.-------------------------------------------------- - -JIT compiled ops requires ninja-------------------------------------------------- - -JIT compiled ops requires ninja - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. ............... -[NO] ....... [NO] -async_iotransformer_inference ................. [NO][NO] .............. [NO][OKAY] - -utils .................. transformer_inference[YES] ........ [NO][OKAY] -....... [OKAY]quantizer - .............. [NO] ....... [OKAY]utils - .................. [YES] --------------------------------------------------...... - [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -async_io ............... [NO]async_io ....... ...............[NO] -[NO] ....... [NO] -transformer_inferencetransformer_inference .... [NO][NO] .............. [OKAY][OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -DeepSpeed C++/CUDA extension op report --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- -JIT compiled ops requires ninja -async_io ............... [NO] ....... [NO] -utilsutils .................................... [YES][YES] ............ [OKAY][OKAY] - -transformer_inference .. [NO] ....... [OKAY] -quantizer ..............quantizer [NO].............. .......[NO] [OKAY]....... - [OKAY] -ninja .................. [OKAY] -utils .................. [YES] ...... [OKAY] --------------------------------------------------- --------------------------------------------------- --------------------------------------------------- -quantizer .............. [NO] ....... [OKAY] -op name ................ installed .. 
compatible --------------------------------------------------- --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] -fused_lamb ............. [NO] ....... [OKAY] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -ninja .................. [OKAY] --------------------------------------------------- -op name ................ installed .. compatible --------------------------------------------------- -cpu_adam ............... [YES] ...... [OKAY] -fused_adam ............. [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -fused_lamb ............. [NO] ....... [OKAY] -async_io ............... [NO] ....... [NO] -sparse_attn ............ [NO] ....... [OKAY] -transformer ............ [NO] ....... [OKAY] -transformer_inference .. [NO] ....... [OKAY] -stochastic_transformer . [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- - - -DeepSpeed C++/CUDA extension op report -DeepSpeed C++/CUDA extension op report -DeepSpeed C++/CUDA extension op report -DeepSpeed C++/CUDA extension op report-------------------------------------------------- - - ---------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.-------------------------------------------------- --------------------------------------------------- --------------------------------------------------- - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.JIT compiled ops requires ninja - - ----------------------------------------------------------------------------------------------------- --------------------------------------------------- - -JIT compiled ops requires ninjaJIT compiled ops requires ninja -JIT compiled ops requires ninja - -ninjaninja .................................... 
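The async_io warning repeated by every rank means the AIO op (used for NVMe offload) cannot be built on these nodes. A hedged sketch reproducing the same check from Python, assuming the AsyncIOBuilder op builder present in this DeepSpeed version:

```python
# Hedged sketch: reproduce the root cause of the async_io warning above.
# AsyncIOBuilder.is_compatible() fails when libaio-dev is absent, which is
# why the table shows "async_io ... [NO] ....... [NO]" on every rank.
from deepspeed.ops.op_builder import AsyncIOBuilder

if not AsyncIOBuilder().is_compatible():
    print("async_io disabled; the log's fix is: apt install libaio-dev")
```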
-DeepSpeed general environment info:
-torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']
-torch version .................... 1.8.1
-torch cuda version ............... 11.1
-nvcc version ..................... 11.2
-deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']
-deepspeed info ................... 0.4.2+bc17042, bc17042, big-science
-deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1
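DeepSpeed ships the `ds_report` console script that prints exactly this op report plus environment block. A small cross-check of the versions it recorded, assuming the same tr1-13B conda environment is active:

```python
# Cross-check of the environment info above; the values in comments are
# the ones this log recorded for the tr1-13B environment.
import subprocess

import torch
import deepspeed

print(torch.__version__)      # 1.8.1
print(torch.version.cuda)     # 11.1
print(deepspeed.__version__)  # 0.4.2+bc17042
subprocess.run(["ds_report"], check=True)  # regenerates op report + env info
```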
-/bin/sh: line 0: type: git: not found
-**** Git info for Megatron: git_hash=unknown git_branch=unknown ****
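The two lines above come from the launcher probing for the current Megatron commit: the compute-node image has no `git` binary, so both fields fall back to "unknown". A hedged sketch of that kind of probe; the helper name is hypothetical, not Megatron's actual function:

```python
# Hypothetical probe matching the fallback seen above: without a `git`
# binary on the node, both fields degrade to "unknown".
import subprocess

def git_info():
    try:
        git_hash = subprocess.check_output(
            ["git", "rev-parse", "--short", "HEAD"]).decode().strip()
        git_branch = subprocess.check_output(
            ["git", "rev-parse", "--abbrev-ref", "HEAD"]).decode().strip()
    except (OSError, subprocess.CalledProcessError):
        git_hash = git_branch = "unknown"
    return git_hash, git_branch

print("**** Git info for Megatron: git_hash=%s git_branch=%s ****" % git_info())
```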
-DeepSpeed C++/CUDA extension op reportDeepSpeed C++/CUDA extension op reportDeepSpeed C++/CUDA extension op report - - - --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- - - -JIT compiled ops requires ninjaNOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. --------------------------------------------------- - -----------------------------------------------------------------------------------------------------JIT compiled ops requires ninja - - -JIT compiled ops requires ninjaJIT compiled ops requires ninja - - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -transformer_inference .. [NO] ....... 
[OKAY] -utils ..................utils .................. [YES] ...... [YES][OKAY] -......quantizer .............. [OKAY][NO] - ....... [OKAY] --------------------------------------------------- -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -ninjaninjaninjaninja ........................................................................ [OKAY][OKAY][OKAY][OKAY] - - - --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- - - - -async_io ............... [NO] ....... [NO] -op nameop nameop name op name ................ ................................ ................ installed installedinstalled installed .... .. .. compatible compatiblecompatible -compatible - --------------------------------------------------- ----------------------------------------------------------------------------------------------------- - - --------------------------------------------------- -transformer_inference .. [NO] ....... [OKAY] -cpu_adamcpu_adam cpu_adamcpu_adam.............................. ...............[YES]...............[YES] [YES] ............[YES] ......[OKAY]...... [OKAY][OKAY] - -[OKAY] - -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -fused_adamfused_adam fused_adamfused_adam ............. .......................... ............. [NO][NO] [NO] [NO]....... ....... .............. [OKAY] [OKAY] -[OKAY][OKAY] - - -fused_lamb .............fused_lambfused_lamb fused_lamb [NO] ............. ............. [NO].................... [NO] ....... [NO][OKAY][OKAY]....... - - .......[OKAY] -[OKAY] -sparse_attnsparse_attn ............ ............[NO]sparse_attn sparse_attn .......[NO] ............ [OKAY] ................... -[NO] [OKAY] [NO]transformer --------------------------------------------------- -DeepSpeed C++/CUDA extension op report-------------------------------------------------- ----------------------------------------------------------------------------------------------------- --------------------------------------------------- - -DeepSpeed C++/CUDA extension op report -DeepSpeed C++/CUDA extension op reportDeepSpeed C++/CUDA extension op report -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - - ....... ............ .......transformer [OKAY] [NO] -[OKAY]............ --------------------------------------------------- ----------------------------------------------------------------------------------------------------- - --------------------------------------------------- -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. -NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op.NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. 
-JIT compiled ops requires ninja - - ------------------------------------------------------------------------------------------------------------------------------------------------------- - - -JIT compiled ops requires ninjaJIT compiled ops requires ninjaJIT compiled ops requires ninja - - - transformertransformer[NO]....... ............[OKAY]................... - [NO][NO][OKAY] -..............stochastic_transformer [OKAY]stochastic_transformer [OKAY] - -. .[NO]stochastic_transformerstochastic_transformer ....... ..[NO][OKAY] -[NO].......[NO] ..............[OKAY] -[OKAY][OKAY] - -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install path torch install path............... ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch versiontorch version ........................................ 1.8.11.8.1 - -torch cuda versiontorch cuda version .............................. 11.111.1 - -nvcc versionnvcc version .......................................... 11.211.2 - -deepspeed install pathdeepspeed install path ...................... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed infodeepspeed info ...................................... 0.4.2+bc17042, bc17042, big-science0.4.2+bc17042, bc17042, big-science - -deepspeed wheel compiled w.deepspeed wheel compiled w. ............ torch 1.8, cuda 11.1torch 1.8, cuda 11.1 - - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -async_io ............... [NO] .......transformer_inference [NO] -.. [NO] ....... [OKAY] -transformer_inference .. utils[NO] ......................... [YES][OKAY] -...... [OKAY] -utils quantizer.................. ..............[YES] [NO]...... .......[OKAY] -[OKAY] -quantizer .............. [NO]-------------------------------------------------- -....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_io ............... [NO] .......async_io [NO]............... - [NO] ....... [NO] -transformer_inference .. [NO] .......transformer_inference [OKAY].. - [NO] ....... [OKAY] -utilsutils .................................... [YES][YES] ............ [OKAY][OKAY] - -quantizer ..............quantizer [NO].............. ....... [OKAY] - [NO] --------------------------------------------------....... - [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. 
[WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -async_io ...............async_io [NO]............... .......[NO] .......[NO] -[NO] -transformer_inference .. [NO] ....... [OKAY] -transformer_inferenceutils .................... [YES] ...... [OKAY] -quantizer ..............[NO] [NO]....... .......[OKAY] - [OKAY] ---------------------------------------------------utils - .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -async_iotransformer_inference ................. [NO][NO] .............. [NO][OKAY] - - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -utils .................. [YES] ......transformer_inference [OKAY].. - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - [NO] ....... quantizer[OKAY] -.............. [NO] ....... [OKAY]utils -async_io ............... [NO] ....... [NO] -async_io ............... [NO] ....... [NO] - .................. [YES] --------------------------------------------------...... - [OKAY] -transformer_inference .. [NO] ....... [OKAY] -quantizer .............. [NO] ....... [OKAY] -transformer_inference ..utils [NO].................. .......[YES] [OKAY]...... --------------------------------------------------- - [OKAY] -quantizer utils.............. ..................[NO] [YES]....... ......[OKAY] -[OKAY] ---------------------------------------------------quantizer - .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. - -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] -async_io async_io............... ...............[NO] [NO]....... .......[NO] [NO] - --------------------------------------------------- -transformer_inference .. [NO]transformer_inference ....... ..[OKAY] -[NO] ....... [OKAY] -utils .................. [YES]utils ........................ [OKAY] -[YES] ......quantizer [OKAY].............. - [NO] ....... [OKAY] -quantizer --------------------------------------------------.............. - [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -utils .................. [YES] ...... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. 
Can be fixed by: `apt install libaio-dev`. -quantizer .............. [NO] ....... [OKAY] -transformer_inference .. [NO] ....... [OKAY]async_io --------------------------------------------------- -............... [NO] ....... [NO] -utils .................. [YES] ...... [OKAY] -quantizer .............. transformer_inference[NO] ......... [NO][OKAY] -....... [OKAY] -utils --------------------------------------------------.................. [YES] ...... [OKAY] - -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -ninjaninjaninjaninja ........................................................................ [OKAY][OKAY][OKAY][OKAY] - - - -quantizer .............. [NO] ....... [OKAY] - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- - - - --------------------------------------------------- -async_io ............... [NO] ....... [NO] -op nameop nameop nameop name ................................................................ installedinstalledinstalledinstalled ...... .. compatible compatiblecompatible - -compatible ----------------------------------------------------------------------------------------------------- - --------------------------------------------------- - --------------------------------------------------- -cpu_adam cpu_adam...............cpu_adam cpu_adam...............[YES]............... ............... [YES][YES]...... ...... [YES]...... [OKAY] [OKAY] -[OKAY]...... - -transformer_inference .. [NO] ....... [OKAY] - [OKAY] -utils .................. [YES] ...... [OKAY] -fused_adamfused_adamfused_adam fused_adam............. ............. ............. [NO].............[NO] [NO][NO]....... ....... .............. [OKAY] [OKAY] -[OKAY][OKAY] - - -quantizer .............. [NO] ....... [OKAY] -fused_lamb fused_lambfused_lamb.............fused_lamb .............[NO].......................... [NO].......[NO][NO] [OKAY] ....... --------------------------------------------------- -....... .......[OKAY][OKAY] - -[OKAY] -sparse_attn ............sparse_attnsparse_attn sparse_attn [NO]............ ............ ............ ....... [NO] [NO][NO] [OKAY] .............. -....... [OKAY][OKAY][OKAY] - -transformer - ............transformertransformer [NO]transformer ............ ............ ....... ............ [NO][NO][NO][OKAY] -..................... [OKAY][OKAY][OKAY] - - -stochastic_transformer stochastic_transformerstochastic_transformer .stochastic_transformer .[NO]. . .......[NO] [NO][NO] [OKAY] .............. -....... [OKAY][OKAY][OKAY] - - - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. 
-DeepSpeed general environment info:
-torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']
-torch version .................... 1.8.1
-torch cuda version ............... 11.1
-nvcc version ..................... 11.2
-deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']
-deepspeed info ................... 0.4.2+bc17042, bc17042, big-science
-deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1
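The fields of this environment block can be reproduced from Python; a minimal sketch, assuming only that torch and deepspeed are importable:

    import torch
    import deepspeed

    # Mirrors the "general environment info" fields printed above.
    print("torch install path:", torch.__path__)       # a list, like the log line
    print("torch version:", torch.__version__)         # 1.8.1 here
    print("torch cuda version:", torch.version.cuda)   # 11.1 here
    print("deepspeed info:", deepspeed.__version__)    # 0.4.2+bc17042 here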
---------------------------------------------------
-DeepSpeed C++/CUDA extension op report
---------------------------------------------------
-NOTE: Ops not installed will be just-in-time (JIT) compiled at
-      runtime if needed. Op compatibility means that your system
-      meet the required dependencies to JIT install the op.
---------------------------------------------------
-JIT compiled ops requires ninja
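The report's last line is the key precondition: without ninja, ops that are not pre-installed cannot be JIT-built at runtime. A minimal way to check for it, as an illustrative sketch rather than DeepSpeed's own code:

    import importlib.util
    import shutil

    # JIT compilation of the ops needs the ninja build tool, either as the
    # Python package or as a binary on PATH.
    has_ninja = (importlib.util.find_spec("ninja") is not None
                 or shutil.which("ninja") is not None)
    print("ninja:", "[OKAY]" if has_ninja else "[MISSING]")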
-ninja .................. [OKAY]
---------------------------------------------------
-op name ................ installed .. compatible
---------------------------------------------------
-cpu_adam ............... [YES] ...... [OKAY]
-fused_adam ............. [NO] ....... [OKAY]
-fused_lamb ............. [NO] ....... [OKAY]
-sparse_attn ............ [NO] ....... [OKAY]
-transformer ............ [NO] ....... [OKAY]
-stochastic_transformer . [NO] ....... [OKAY]
---------------------------------------------------
-/bin/sh: line 0: type: git: not found
-**** Git info for Megatron: git_hash=unknown git_branch=unknown ****
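The `type: git: not found` line explains the `git_hash=unknown` fields: git is not on PATH inside the job environment, so the hash probe falls back to a default. An illustrative reconstruction of such a probe, not Megatron's actual code:

    import subprocess

    def git_hash() -> str:
        # When git is missing from PATH, check_output raises OSError and we
        # fall back to "unknown", producing the banner seen in the log.
        try:
            out = subprocess.check_output(["git", "rev-parse", "--short", "HEAD"])
            return out.decode().strip()
        except (OSError, subprocess.CalledProcessError):
            return "unknown"

    print(f"**** Git info for Megatron: git_hash={git_hash()} ****")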
-using world size: 256, data-parallel-size: 8, tensor-model-parallel size: 4, pipeline-model-parallel size: 8
-using torch.float16 for parameters ...
------------------------- arguments ------------------------
-  accumulate_allreduce_grads_in_fp32 .............. False
-  adam_beta1 ...................................... 0.9
-  adam_beta2 ...................................... 0.999
-  adam_eps ........................................ 1e-08
-  adlr_autoresume ................................. False
-  adlr_autoresume_interval ........................ 1000
-  apply_query_key_layer_scaling ................... True
-  apply_residual_connection_post_layernorm ........ False
-  attention_dropout ............................... 0.1
-  attention_softmax_in_fp32 ....................... False
-  bert_binary_head ................................ True
-  bert_load ....................................... None
-  bf16 ............................................ False
-  bias_dropout_fusion ............................. True
-  bias_gelu_fusion ................................ True
-  biencoder_projection_dim ........................ 0
-  biencoder_shared_query_context_model ............ False
-  block_data_path ................................. None
-  checkpoint_activations .......................... True
-  checkpoint_in_cpu ............................... False
-  checkpoint_num_layers ........................... 1
-  clip_grad ....................................... 1.0
-  codecarbon_dir .................................. /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/tr8-104B-logs/codecarbon
-  consumed_train_samples .......................... 0
-  consumed_valid_samples .......................... 0
-  contigious_checkpointing ........................ False
-  cpu_optimizer ................................... False
-  cpu_torch_adam .................................. False
-  data_impl ....................................... mmap
-  data_parallel_size .............................. 8
-  data_path ....................................... ['/gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document']
-  dataloader_type ................................. single
-  DDP_impl ........................................ local
-  decoder_seq_length .............................. None
-  deepscale ....................................... False
-  deepscale_config ................................ None
-  deepspeed ....................................... True
-  deepspeed_activation_checkpointing .............. True
-  deepspeed_config ................................ ./ds_config.1289770.json
-  deepspeed_mpi ................................... False
-  distribute_checkpointed_activations ............. False
-  distributed_backend ............................. nccl
-  embedding_path .................................. None
-  encoder_seq_length .............................. 2048
-  eod_mask_loss ................................... False
-  eval_interval ................................... 1000
-  eval_iters ...................................... 5
-  evidence_data_path .............................. None
-  exit_duration_in_mins ........................... 1190
-  exit_interval ................................... None
-  ffn_hidden_size ................................. 20480
-  finetune ........................................ False
-  fp16 ............................................ True
-  fp16_lm_cross_entropy ........................... False
-  fp32_residual_connection ........................ False
-  global_batch_size ............................... 2048
-  hidden_dropout .................................. 0.1
-  hidden_size ..................................... 16384
-  hysteresis ...................................... 2
-  ict_head_size ................................... None
-  ict_load ........................................ None
-  img_dim ......................................... 224
-  indexer_batch_size .............................. 128
-  indexer_log_interval ............................ 1000
-  init_method_std ................................. 0.02
-  init_method_xavier_uniform ...................... False
-  initial_loss_scale .............................. 4294967296
-  kv_channels ..................................... 512
-  layernorm_epsilon ............................... 1e-05
-  lazy_mpu_init ................................... None
-  load ............................................ /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints
-  local_rank ...................................... 0
-  log_batch_size_to_tensorboard ................... True
-  log_interval .................................... 10
-  log_learning_rate_to_tensorboard ................ True
-  log_loss_scale_to_tensorboard ................... True
-  log_num_zeros_in_grad ........................... False
-  log_params_norm ................................. False
-  log_timers_to_tensorboard ....................... True
-  log_validation_ppl_to_tensorboard ............... True
-  loss_scale ...................................... 12.0
-  loss_scale_window ............................... 1000
-  lr .............................................. 6e-05
-  lr_decay_iters .................................. None
-  lr_decay_samples ................................ 126953125
-  lr_decay_style .................................. cosine
-  lr_warmup_fraction .............................. None
-  lr_warmup_iters ................................. 0
-  lr_warmup_samples ............................... 216320
-  make_vocab_size_divisible_by .................... 128
-  mask_prob ....................................... 0.15
-  masked_softmax_fusion ........................... True
-  max_position_embeddings ......................... 2048
-  memory_centric_tiled_linear ..................... False
-  merge_file ...................................... /gpfswork/rech/six/commun/code/tr8-104B/Megatron-DeepSpeed-tr8-104B/data/gpt2-merges.txt
-  micro_batch_size ................................ 1
-  min_loss_scale .................................. 1.0
-  min_lr .......................................... 6e-06
-  mmap_warmup ..................................... False
-  no_load_optim ................................... None
-  no_load_rng ..................................... None
-  no_save_optim ................................... None
-  no_save_rng ..................................... None
-  num_attention_heads ............................. 32
-  num_channels .................................... 3
-  num_classes ..................................... 1000
-  num_layers ...................................... 32
-  num_layers_per_virtual_pipeline_stage ........... None
-  num_workers ..................................... 2
-  onnx_safe ....................................... None
-  openai_gelu ..................................... False
-  optimizer ....................................... adam
-  override_lr_scheduler ........................... False
-  params_dtype .................................... torch.float16
-  partition_activations ........................... False
-  patch_dim ....................................... 16
-  pipeline_model_parallel_size .................... 8
-  position_embedding_type ......................... PositionEmbeddingType.absolute
-  profile_backward ................................ False
-  query_in_block_prob ............................. 0.1
-  rampup_batch_size ............................... ['16', '16', '6_000_000']
-  rank ............................................ 0
-  remote_device ................................... none
-  reset_attention_mask ............................ False
-  reset_position_ids .............................. False
-  retriever_report_topk_accuracies ................ []
-  retriever_score_scaling ......................... False
-  retriever_seq_length ............................ 256
-  sample_rate ..................................... 1.0
-  save ............................................ /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints
-  save_interval ................................... 300
-  scatter_gather_tensors_in_pipeline .............. True
-  scattered_embeddings ............................ False
-  seed ............................................ 43
-  seq_length ...................................... 2048
-  sgd_momentum .................................... 0.9
-  short_seq_prob .................................. 0.1
-  split ........................................... 949,50,1
-  split_transformers .............................. False
-  synchronize_each_layer .......................... False
-  tensor_model_parallel_size ...................... 4
-  tensorboard_dir ................................. /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/tr8-104B-logs/tensorboard
-  tensorboard_log_interval ........................ 1
-  tensorboard_queue_size .......................... 5
-  tile_factor ..................................... 1
-  titles_data_path ................................ None
-  tokenizer_name_or_path .......................... None
-  tokenizer_type .................................. GPT2BPETokenizer
-  train_iters ..................................... None
-  train_samples ................................... 300000000
-  use_checkpoint_lr_scheduler ..................... False
-  use_contiguous_buffers_in_ddp ................... False
-  use_cpu_initialization .......................... None
-  use_one_sent_docs ............................... False
-  use_pin_memory .................................. False
-  virtual_pipeline_model_parallel_size ............ None
-  vocab_extra_ids ................................. 0
-  vocab_file ...................................... /gpfswork/rech/six/commun/code/tr8-104B/Megatron-DeepSpeed-tr8-104B/data/gpt2-vocab.json
-  weight_decay .................................... 0.1
-  world_size ...................................... 256
-  zero_allgather_bucket_size ...................... 0.0
-  zero_contigious_gradients ....................... False
-  zero_reduce_bucket_size ......................... 0.0
-  zero_reduce_scatter ............................. False
-  zero_stage ...................................... 1
---------------------- end of arguments ---------------------
-will use batch size rampup starting from global batch size 16 to global batch size 2048 with batch size increments 16 over 6000000 samples.
-> building GPT2BPETokenizer tokenizer ...
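Two consistency checks worth spelling out from the dump above: world_size factors exactly into the three parallelism degrees (8 data-parallel × 4 tensor-parallel × 8 pipeline-parallel = 256), and the rampup_batch_size triple ['16', '16', '6_000_000'] defines a stepwise schedule from 16 up to the global_batch_size of 2048. A sketch of the implied schedule, illustrative rather than Megatron's implementation:

    # world size must equal data-parallel * tensor-parallel * pipeline-parallel
    dp, tp, pp = 8, 4, 8
    assert dp * tp * pp == 256

    # rampup_batch_size = ['16', '16', '6_000_000']: start at 16, add 16 per
    # step, spreading (2048 - 16) / 16 = 127 increments over 6,000,000 samples.
    start, inc, ramp_samples, final = 16, 16, 6_000_000, 2048
    samples_per_step = ramp_samples // ((final - start) // inc)  # ~47,244

    def global_batch_size(consumed_samples: int) -> int:
        if consumed_samples >= ramp_samples:
            return final
        return min(final, start + inc * (consumed_samples // samples_per_step))

    assert global_batch_size(0) == 16
    assert global_batch_size(6_000_000) == 2048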
.....................11.2 -11.2deepspeed install path - deepspeed install path........... ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed infodeepspeed info ...................................... 0.4.2+bc17042, bc17042, big-science0.4.2+bc17042, bc17042, big-science - -deepspeed wheel compiled w.deepspeed wheel compiled w. ............ torch 1.8, cuda 11.1torch 1.8, cuda 11.1 - -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 - [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`. -async_io ............... [NO] ....... [NO] -transformer_inference .. [NO] ....... [OKAY] -utils .................. [YES] ...... [OKAY] -quantizer .............. [NO] ....... [OKAY] --------------------------------------------------- -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -DeepSpeed general environment info:DeepSpeed general environment info: - -torch version .................... 1.8.1 -DeepSpeed general environment info: -torch install path torch install path............... ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch versiontorch version ........................................ 1.8.1 -1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -torch cuda version torch cuda version............... ...............11.1 -11.1nvcc version -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 --------------------------------------------------- -DeepSpeed C++/CUDA extension op report --------------------------------------------------- ---------------------------------------------------NOTE: Ops not installed will be just-in-time (JIT) compiled at - runtime if needed. Op compatibility means that your system - meet the required dependencies to JIT install the op. - ---------------------------------------------------DeepSpeed C++/CUDA extension op report - - nvcc version..................... 
-DeepSpeed C++/CUDA extension op report
---------------------------------------------------
-NOTE: Ops not installed will be just-in-time (JIT) compiled at
-      runtime if needed. Op compatibility means that your system
-      meet the required dependencies to JIT install the op.
---------------------------------------------------
-JIT compiled ops requires ninja
-ninja .................. [OKAY]
---------------------------------------------------
-op name ................ installed .. compatible
---------------------------------------------------
-cpu_adam ............... [YES] ...... [OKAY]
-fused_adam ............. [NO] ....... [OKAY]
-fused_lamb ............. [NO] ....... [OKAY]
-sparse_attn ............ [NO] ....... [OKAY]
-transformer ............ [NO] ....... [OKAY]
-stochastic_transformer . [NO] ....... [OKAY]
---------------------------------------------------
- [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`.
-async_io ............... [NO] ....... [NO]
-transformer_inference .. [NO] ....... [OKAY]
-utils .................. [YES] ...... [OKAY]
-quantizer .............. [NO] ....... [OKAY]
---------------------------------------------------
-DeepSpeed general environment info:
-torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']
-torch version .................... 1.8.1
-torch cuda version ............... 11.1
-nvcc version ..................... 11.2
-deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']
-deepspeed info ................... 0.4.2+bc17042, bc17042, big-science
-deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1
-/bin/sh: line 0: type: git: not found
-**** Git info for Megatron: git_hash=unknown git_branch=unknown ****
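The per-op [YES]/[NO]/[OKAY] lines above are produced by DeepSpeed's op builders, which probe whether each C++/CUDA extension is pre-installed or can be JIT-compiled with ninja. Below is a minimal sketch of reproducing a couple of those checks; the builder classes and their NAME attribute are assumed to match DeepSpeed 0.4.x, and the bundled `ds_report` CLI prints the full table directly.

# Minimal sketch: query op compatibility the way the report above does.
# Assumes DeepSpeed 0.4.x public op builders; `ds_report` prints the full table.
from deepspeed.ops.op_builder import CPUAdamBuilder, FusedAdamBuilder

for builder in (CPUAdamBuilder(), FusedAdamBuilder()):
    # is_compatible() returns True when the system can JIT-install the op.
    print(builder.NAME, "compatible:", builder.is_compatible())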
11.211.2 - -deepspeed install pathdeepspeed install path ...................... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed infodeepspeed info ...................................... 0.4.2+bc17042, bc17042, big-science0.4.2+bc17042, bc17042, big-science - -deepspeed wheel compiled w.deepspeed wheel compiled w. ............ torch 1.8, cuda 11.1torch 1.8, cuda 11.1 - -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install pathtorch install path .............................. ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch versiontorch version ........................................ 1.8.11.8.1 -torch cuda version -............... 11.1torch cuda version - ...............nvcc version .....................11.1 -11.2nvcc version - deepspeed install path..................... ...........11.2 -deepspeed install path['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -...........deepspeed info ...................['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. - deepspeed info...... ...................torch 1.8, cuda 11.1 -0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install pathtorch install path .............................. ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch versiontorch version ........................................ 
1.8.11.8.1 - -torch cuda versiontorch cuda version .............................. 11.111.1 - -nvcc versionnvcc version .......................................... 11.211.2 - -deepspeed install pathdeepspeed install path ...................... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed infodeepspeed info ...................................... 0.4.2+bc17042, bc17042, big-science0.4.2+bc17042, bc17042, big-science - -deepspeed wheel compiled w.deepspeed wheel compiled w. ............ torch 1.8, cuda 11.1torch 1.8, cuda 11.1 - -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install path torch install path............... ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch versiontorch version ........................................ 1.8.11.8.1 - -torch cuda versiontorch cuda version .............................. 11.111.1 - -nvcc versionnvcc version .......................................... 11.211.2 - -deepspeed install pathdeepspeed install path ...................... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed infodeepspeed info ...................................... 0.4.2+bc17042, bc17042, big-science0.4.2+bc17042, bc17042, big-science - -deepspeed wheel compiled w.deepspeed wheel compiled w. ............ torch 1.8, cuda 11.1torch 1.8, cuda 11.1 - -DeepSpeed general environment info: -DeepSpeed general environment info: -torch install path ............... torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']torch version - .................... 1.8.1torch version -.................... torch cuda version1.8.1 -............... torch cuda version11.1 -...............nvcc version 11.1..................... - nvcc version11.2 -.....................deepspeed install path 11.2........... - deepspeed install path['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -........... deepspeed info ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']................... - deepspeed info0.4.2+bc17042, bc17042, big-science -...................deepspeed wheel compiled w. 0.4.2+bc17042, bc17042, big-science...... - deepspeed wheel compiled w.torch 1.8, cuda 11.1 -...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 
1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install pathtorch install path ............... ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch versiontorch version ........................................ 1.8.11.8.1 - -torch cuda versiontorch cuda version .............................. 11.111.1 - -nvcc versionnvcc version .......................................... 11.211.2 - -deepspeed install pathdeepspeed install path ...................... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed infodeepspeed info ...................................... 0.4.2+bc17042, bc17042, big-science0.4.2+bc17042, bc17042, big-science - -deepspeed wheel compiled w.deepspeed wheel compiled w. ............ torch 1.8, cuda 11.1torch 1.8, cuda 11.1 - -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install path ...............torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version ....................torch version 1.8.1.................... - 1.8.1torch cuda version - ............... torch cuda version11.1 -............... nvcc version11.1 -..................... nvcc version11.2 -..................... deepspeed install path11.2 -........... deepspeed install path ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']........... - deepspeed info['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -................... deepspeed info0.4.2+bc17042, bc17042, big-science -................... deepspeed wheel compiled w.0.4.2+bc17042, bc17042, big-science -...... deepspeed wheel compiled w.torch 1.8, cuda 11.1 -...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -DeepSpeed general environment info:DeepSpeed general environment info: - -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -torch install pathtorch install path .............................. ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch versiontorch version ........................................ 1.8.11.8.1 - -torch cuda versiontorch cuda version .............................. 
11.111.1 - -nvcc versionnvcc version .......................................... 11.211.2 - -deepspeed install pathdeepspeed install path ...................... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed infodeepspeed info ...................................... 0.4.2+bc17042, bc17042, big-science0.4.2+bc17042, bc17042, big-science - -deepspeed wheel compiled w.deepspeed wheel compiled w. ............ torch 1.8, cuda 11.1torch 1.8, cuda 11.1 - -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -/bin/sh: line 0: type: git: not found -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install path torch install path............... ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch versiontorch version ........................................ 1.8.11.8.1 - -torch cuda versiontorch cuda version .............................. 11.111.1 - -nvcc versionnvcc version .......................................... 11.211.2 - -deepspeed install pathdeepspeed install path ...................... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed infodeepspeed info ...................................... 0.4.2+bc17042, bc17042, big-science0.4.2+bc17042, bc17042, big-science - -deepspeed wheel compiled w.deepspeed wheel compiled w. ............ torch 1.8, cuda 11.1torch 1.8, cuda 11.1 - -/bin/sh: line 0: type: git: not found -DeepSpeed general environment info: -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 
- [WARNING]  async_io requires the libraries: ['libaio-dev'] but are missing. Can be fixed by: `apt install libaio-dev`.
-async_io ............... [NO] ....... [NO]
-transformer_inference .. [NO] ....... [OKAY]
-utils .................. [YES] ...... [OKAY]
-quantizer .............. [NO] ....... [OKAY]
---------------------------------------------------
-DeepSpeed general environment info:
-torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']
-torch version .................... 1.8.1
-torch cuda version ............... 11.1
-nvcc version ..................... 11.2
-deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']
-deepspeed info ...................
0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -/bin/sh: line 0: type: git: not found -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -/bin/sh: line 0: type: git: not found -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found 
-/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info:DeepSpeed general environment info: - -torch install pathtorch install path .............................. ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch']['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] - -torch versiontorch version ........................................ 1.8.11.8.1 - -torch cuda versiontorch cuda version ............... ...............11.1 -11.1 -nvcc versionnvcc version .......................................... 11.211.2 - -deepspeed install pathdeepspeed install path ...................... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] - -deepspeed infodeepspeed info ...................................... 0.4.2+bc17042, bc17042, big-science0.4.2+bc17042, bc17042, big-science - -deepspeed wheel compiled w.deepspeed wheel compiled w. ............ 
torch 1.8, cuda 11.1torch 1.8, cuda 11.1 - -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1 -nvcc version ..................... 11.2 -deepspeed install path ........... ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed'] -deepspeed info ................... 0.4.2+bc17042, bc17042, big-science -deepspeed wheel compiled w. ...... torch 1.8, cuda 11.1 -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -/bin/sh: line 0: type: git: not found -**** Git info for Megatron: git_hash=unknown git_branch=unknown ******** Git info for Megatron: git_hash=unknown git_branch=unknown **** - -**** Git info for Megatron: git_hash=unknown git_branch=unknown **** -DeepSpeed general environment info: -torch install path ............... ['/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch'] -torch version .................... 1.8.1 -torch cuda version ............... 11.1DeepSpeed general environment info: -nvcc version ..................... - 11.2 -deepspeed install path ...........torch install path ['/gpfsssd/worksf/projects/rech/six/commun/code/tr1-13B/DeepSpeed-big-science/deepspeed']............... - deepspeed info ................... 
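The block above is a per-rank environment report: every rank prints an identical copy, and concurrent writes to the shared log interleave them, which is why the raw file repeats (and sometimes character-shuffles) the same seven fields. For reference, a minimal sketch that gathers the same fields from public PyTorch/DeepSpeed attributes; this is an illustration, not the reporting code that produced the log:

    # Minimal sketch: reproduce the fields of the per-rank environment report
    # using only public attributes. The nvcc version would additionally
    # require shelling out to `nvcc --version`.
    import torch
    import deepspeed

    print("torch install path ...............", list(torch.__path__))
    print("torch version ....................", torch.__version__)
    print("torch cuda version ...............", torch.version.cuda)
    print("deepspeed install path ...........", list(deepspeed.__path__))
    print("deepspeed info ...................", deepspeed.__version__)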
- > padded vocab (size: 50257) with 431 dummy tokens (new size: 50688)
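The padding above follows Megatron's rule: grow the tokenizer vocab (50257 for GPT-2's BPE) to the next multiple of --make-vocab-size-divisible-by times the tensor-parallel size, so the embedding matrix shards evenly across tensor-parallel ranks. A quick check of the logged numbers, assuming the usual divisor default of 128 and the tensor-parallel size of 4 used in this run:

    # Sketch of the vocab padding rule; 128 is an assumed
    # --make-vocab-size-divisible-by value (the common default),
    # 4 is the tensor-parallel size from this run's init messages.
    def pad_vocab_size(orig_vocab_size, divisible_by=128, tp_size=4):
        multiple = divisible_by * tp_size
        return ((orig_vocab_size + multiple - 1) // multiple) * multiple

    assert pad_vocab_size(50257) == 50688   # 431 dummy tokens, as logged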
-> setting codecarbon ...
-> setting tensorboard ...
-> initializing torch distributed ...
-> initializing tensor model parallel with size 4
-> initializing pipeline model parallel with size 8
-> setting random seeds to 43 ...
-[2021-09-30 03:52:24,170] [INFO] [checkpointing.py:226:model_parallel_cuda_manual_seed] > initializing model parallel cuda seeds on global rank 0, model parallel rank 0, and data parallel rank 0 with model parallel seed: 2761 and data parallel seed: 43
-> compiling dataset index builder ...
-make: Entering directory '/gpfsssd/worksf/projects/rech/six/commun/code/tr8-104B/Megatron-DeepSpeed-tr8-104B/megatron/data'
-make: Nothing to be done for 'default'.
-make: Leaving directory '/gpfsssd/worksf/projects/rech/six/commun/code/tr8-104B/Megatron-DeepSpeed-tr8-104B/megatron/data'
->>> done with dataset index builder. Compilation time: 0.302 seconds
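The seed line above is deterministic bookkeeping: the logged model-parallel seed 2761 is the base seed 43 plus a fixed offset of 2718 (plus the tensor-parallel rank, 0 for the rank that logs), while the data-parallel seed stays at 43. This matches Megatron-LM's model_parallel_cuda_manual_seed scheme, which decorrelates dropout across tensor-parallel ranks while keeping data-parallel replicas in lockstep; a sketch under that assumption:

    # Assumed reconstruction of Megatron's seed derivation, consistent with
    # the logged values (base seed 43 -> model-parallel seed 2761 on tp rank 0).
    def model_parallel_seeds(seed: int, tp_rank: int):
        tensor_mp_seed = seed + 2718 + tp_rank  # unique per tensor-parallel rank
        data_parallel_seed = seed               # shared by data-parallel replicas
        return tensor_mp_seed, data_parallel_seed

    assert model_parallel_seeds(43, 0) == (2761, 43)  # matches the log line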
-> compiling and loading fused kernels ...
-/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch/utils/cpp_extension.py:283: UserWarning:
-
-                               !! WARNING !!
-
-!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
-Your compiler (c++) is not compatible with the compiler Pytorch was
-built with for this platform, which is g++ on linux. Please
-use g++ to to compile your extension. Alternatively, you may
-compile PyTorch from source using c++, and then you can also use
-c++ to compile your extension.
-
-See https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md for help
-with compiling PyTorch from source.
-!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
-
-                               !! WARNING !!
-
-  warnings.warn(WRONG_COMPILER_WARNING.format(
-Detected CUDA files, patching ldflags
-Emitting ninja build file /gpfsssd/worksf/projects/rech/six/commun/code/tr8-104B/Megatron-DeepSpeed-tr8-104B/megatron/fused_kernels/build/build.ninja...
-Building extension module scaled_upper_triang_masked_softmax_cuda...
-Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
-ninja: no work to do.
-Loading extension module scaled_upper_triang_masked_softmax_cuda...
-Detected CUDA files, patching ldflags
-Emitting ninja build file /gpfsssd/worksf/projects/rech/six/commun/code/tr8-104B/Megatron-DeepSpeed-tr8-104B/megatron/fused_kernels/build/build.ninja...
-Building extension module scaled_masked_softmax_cuda...
-Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
-ninja: no work to do.
-Loading extension module scaled_masked_softmax_cuda...
-Detected CUDA files, patching ldflags
-Emitting ninja build file /gpfsssd/worksf/projects/rech/six/commun/code/tr8-104B/Megatron-DeepSpeed-tr8-104B/megatron/fused_kernels/build/build.ninja...
-Building extension module fused_mix_prec_layer_norm_cuda...
-Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
-ninja: no work to do.
-Loading extension module fused_mix_prec_layer_norm_cuda...
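These modules are JIT-compiled through PyTorch's torch.utils.cpp_extension loader, which emits a ninja build file and caches the compiled artifact, so on a warm cache ninja reports "no work to do" and the modules merely load. A minimal sketch of that loading pattern; the source file names are placeholders, not the exact files under megatron/fused_kernels:

    # Sketch of JIT-loading a fused CUDA kernel via PyTorch's C++ extension
    # loader; this mirrors the mechanism behind the build messages above.
    from torch.utils import cpp_extension

    scaled_masked_softmax_cuda = cpp_extension.load(
        name="scaled_masked_softmax_cuda",
        sources=["scaled_masked_softmax.cpp",       # placeholder file names
                 "scaled_masked_softmax_cuda.cu"],
        extra_cuda_cflags=["-O3"],
        verbose=True,  # prints the Emitting/Building/Loading lines seen here
    )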
->>> done with compiling and loading fused kernels. Compilation time: 21.604 seconds
-time to initialize megatron (seconds): 17.820
-[after megatron is initialized] datetime: 2021-09-30 03:52:46
-building GPT model ...
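The "Using topology" dump that follows is the enumeration of a 3D rank grid: 8 pipeline stages x 8 data-parallel replicas x 4 tensor-parallel ranks = 256 GPUs, with global ranks assigned row-major over (pipe, data, model). The mapping is pure arithmetic; a sketch that reproduces the logged entries (the loop order is inferred from the logged values, not taken from the topology code):

    # Rank layout behind the topology dump: row-major over (pipe, data, model).
    PIPE, DATA, MODEL = 8, 8, 4   # sizes from this run's init messages

    def coord_to_rank(pipe: int, data: int, model: int) -> int:
        return (pipe * DATA + data) * MODEL + model

    assert coord_to_rank(0, 0, 0) == 0     # first entry of the dump
    assert coord_to_rank(1, 0, 0) == 32    # first rank of pipeline stage 1
    assert coord_to_rank(7, 0, 1) == 225   # matches the dump's tail below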
-[2021-09-30 03:52:46,250] [INFO] [utils.py:680:see_memory_usage] Before Building Model -/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch/cuda/memory.py:373: FutureWarning: torch.cuda.memory_cached has been renamed to torch.cuda.memory_reserved - warnings.warn( -/gpfswork/rech/six/commun/conda/tr1-13B/lib/python3.8/site-packages/torch/cuda/memory.py:381: FutureWarning: torch.cuda.max_memory_cached has been renamed to torch.cuda.max_memory_reserved - warnings.warn( -[2021-09-30 03:52:46,252] [INFO] [utils.py:681:see_memory_usage] MA 0.0 GB Max_MA 0.0 GB CA 0.0 GB Max_CA 0 GB -[2021-09-30 03:52:46,252] [INFO] [utils.py:689:see_memory_usage] CPU Virtual Memory: used = 38.22 GB, percent = 20.4% -SEED_LAYERS=False BASE_SEED=1234 SEED_FN=None -Using topology: {ProcessCoord(pipe=0, data=0, model=0): 0, ProcessCoord(pipe=0, data=0, model=1): 1, ProcessCoord(pipe=0, data=0, model=2): 2, ProcessCoord(pipe=0, data=0, model=3): 3, ProcessCoord(pipe=0, data=1, model=0): 4, ProcessCoord(pipe=0, data=1, model=1): 5, ProcessCoord(pipe=0, data=1, model=2): 6, ProcessCoord(pipe=0, data=1, model=3): 7, ProcessCoord(pipe=0, data=2, model=0): 8, ProcessCoord(pipe=0, data=2, model=1): 9, ProcessCoord(pipe=0, data=2, model=2): 10, ProcessCoord(pipe=0, data=2, model=3): 11, ProcessCoord(pipe=0, data=3, model=0): 12, ProcessCoord(pipe=0, data=3, model=1): 13, ProcessCoord(pipe=0, data=3, model=2): 14, ProcessCoord(pipe=0, data=3, model=3): 15, ProcessCoord(pipe=0, data=4, model=0): 16, ProcessCoord(pipe=0, data=4, model=1): 17, ProcessCoord(pipe=0, data=4, model=2): 18, ProcessCoord(pipe=0, data=4, model=3): 19, ProcessCoord(pipe=0, data=5, model=0): 20, ProcessCoord(pipe=0, data=5, model=1): 21, ProcessCoord(pipe=0, data=5, model=2): 22, ProcessCoord(pipe=0, data=5, model=3): 23, ProcessCoord(pipe=0, data=6, model=0): 24, ProcessCoord(pipe=0, data=6, model=1): 25, ProcessCoord(pipe=0, data=6, model=2): 26, ProcessCoord(pipe=0, data=6, model=3): 27, ProcessCoord(pipe=0, data=7, model=0): 28, ProcessCoord(pipe=0, data=7, model=1): 29, ProcessCoord(pipe=0, data=7, model=2): 30, ProcessCoord(pipe=0, data=7, model=3): 31, ProcessCoord(pipe=1, data=0, model=0): 32, ProcessCoord(pipe=1, data=0, model=1): 33, ProcessCoord(pipe=1, data=0, model=2): 34, ProcessCoord(pipe=1, data=0, model=3): 35, ProcessCoord(pipe=1, data=1, model=0): 36, ProcessCoord(pipe=1, data=1, model=1): 37, ProcessCoord(pipe=1, data=1, model=2): 38, ProcessCoord(pipe=1, data=1, model=3): 39, ProcessCoord(pipe=1, data=2, model=0): 40, ProcessCoord(pipe=1, data=2, model=1): 41, ProcessCoord(pipe=1, data=2, model=2): 42, ProcessCoord(pipe=1, data=2, model=3): 43, ProcessCoord(pipe=1, data=3, model=0): 44, ProcessCoord(pipe=1, data=3, model=1): 45, ProcessCoord(pipe=1, data=3, model=2): 46, ProcessCoord(pipe=1, data=3, model=3): 47, ProcessCoord(pipe=1, data=4, model=0): 48, ProcessCoord(pipe=1, data=4, model=1): 49, ProcessCoord(pipe=1, data=4, model=2): 50, ProcessCoord(pipe=1, data=4, model=3): 51, ProcessCoord(pipe=1, data=5, model=0): 52, ProcessCoord(pipe=1, data=5, model=1): 53, ProcessCoord(pipe=1, data=5, model=2): 54, ProcessCoord(pipe=1, data=5, model=3): 55, ProcessCoord(pipe=1, data=6, model=0): 56, ProcessCoord(pipe=1, data=6, model=1): 57, ProcessCoord(pipe=1, data=6, model=2): 58, ProcessCoord(pipe=1, data=6, model=3): 59, ProcessCoord(pipe=1, data=7, model=0): 60, ProcessCoord(pipe=1, data=7, model=1): 61, ProcessCoord(pipe=1, data=7, model=2): 62, ProcessCoord(pipe=1, data=7, model=3): 63, ProcessCoord(pipe=2, 
data=0, model=0): 64, ProcessCoord(pipe=2, data=0, model=1): 65, ProcessCoord(pipe=2, data=0, model=2): 66, ProcessCoord(pipe=2, data=0, model=3): 67, ProcessCoord(pipe=2, data=1, model=0): 68, ProcessCoord(pipe=2, data=1, model=1): 69, ProcessCoord(pipe=2, data=1, model=2): 70, ProcessCoord(pipe=2, data=1, model=3): 71, ProcessCoord(pipe=2, data=2, model=0): 72, ProcessCoord(pipe=2, data=2, model=1): 73, ProcessCoord(pipe=2, data=2, model=2): 74, ProcessCoord(pipe=2, data=2, model=3): 75, ProcessCoord(pipe=2, data=3, model=0): 76, ProcessCoord(pipe=2, data=3, model=1): 77, ProcessCoord(pipe=2, data=3, model=2): 78, ProcessCoord(pipe=2, data=3, model=3): 79, ProcessCoord(pipe=2, data=4, model=0): 80, ProcessCoord(pipe=2, data=4, model=1): 81, ProcessCoord(pipe=2, data=4, model=2): 82, ProcessCoord(pipe=2, data=4, model=3): 83, ProcessCoord(pipe=2, data=5, model=0): 84, ProcessCoord(pipe=2, data=5, model=1): 85, ProcessCoord(pipe=2, data=5, model=2): 86, ProcessCoord(pipe=2, data=5, model=3): 87, ProcessCoord(pipe=2, data=6, model=0): 88, ProcessCoord(pipe=2, data=6, model=1): 89, ProcessCoord(pipe=2, data=6, model=2): 90, ProcessCoord(pipe=2, data=6, model=3): 91, ProcessCoord(pipe=2, data=7, model=0): 92, ProcessCoord(pipe=2, data=7, model=1): 93, ProcessCoord(pipe=2, data=7, model=2): 94, ProcessCoord(pipe=2, data=7, model=3): 95, ProcessCoord(pipe=3, data=0, model=0): 96, ProcessCoord(pipe=3, data=0, model=1): 97, ProcessCoord(pipe=3, data=0, model=2): 98, ProcessCoord(pipe=3, data=0, model=3): 99, ProcessCoord(pipe=3, data=1, model=0): 100, ProcessCoord(pipe=3, data=1, model=1): 101, ProcessCoord(pipe=3, data=1, model=2): 102, ProcessCoord(pipe=3, data=1, model=3): 103, ProcessCoord(pipe=3, data=2, model=0): 104, ProcessCoord(pipe=3, data=2, model=1): 105, ProcessCoord(pipe=3, data=2, model=2): 106, ProcessCoord(pipe=3, data=2, model=3): 107, ProcessCoord(pipe=3, data=3, model=0): 108, ProcessCoord(pipe=3, data=3, model=1): 109, ProcessCoord(pipe=3, data=3, model=2): 110, ProcessCoord(pipe=3, data=3, model=3): 111, ProcessCoord(pipe=3, data=4, model=0): 112, ProcessCoord(pipe=3, data=4, model=1): 113, ProcessCoord(pipe=3, data=4, model=2): 114, ProcessCoord(pipe=3, data=4, model=3): 115, ProcessCoord(pipe=3, data=5, model=0): 116, ProcessCoord(pipe=3, data=5, model=1): 117, ProcessCoord(pipe=3, data=5, model=2): 118, ProcessCoord(pipe=3, data=5, model=3): 119, ProcessCoord(pipe=3, data=6, model=0): 120, ProcessCoord(pipe=3, data=6, model=1): 121, ProcessCoord(pipe=3, data=6, model=2): 122, ProcessCoord(pipe=3, data=6, model=3): 123, ProcessCoord(pipe=3, data=7, model=0): 124, ProcessCoord(pipe=3, data=7, model=1): 125, ProcessCoord(pipe=3, data=7, model=2): 126, ProcessCoord(pipe=3, data=7, model=3): 127, ProcessCoord(pipe=4, data=0, model=0): 128, ProcessCoord(pipe=4, data=0, model=1): 129, ProcessCoord(pipe=4, data=0, model=2): 130, ProcessCoord(pipe=4, data=0, model=3): 131, ProcessCoord(pipe=4, data=1, model=0): 132, ProcessCoord(pipe=4, data=1, model=1): 133, ProcessCoord(pipe=4, data=1, model=2): 134, ProcessCoord(pipe=4, data=1, model=3): 135, ProcessCoord(pipe=4, data=2, model=0): 136, ProcessCoord(pipe=4, data=2, model=1): 137, ProcessCoord(pipe=4, data=2, model=2): 138, ProcessCoord(pipe=4, data=2, model=3): 139, ProcessCoord(pipe=4, data=3, model=0): 140, ProcessCoord(pipe=4, data=3, model=1): 141, ProcessCoord(pipe=4, data=3, model=2): 142, ProcessCoord(pipe=4, data=3, model=3): 143, ProcessCoord(pipe=4, data=4, model=0): 144, ProcessCoord(pipe=4, data=4, model=1): 145, 
ProcessCoord(pipe=4, data=4, model=2): 146, ProcessCoord(pipe=4, data=4, model=3): 147, ProcessCoord(pipe=4, data=5, model=0): 148, ProcessCoord(pipe=4, data=5, model=1): 149, ProcessCoord(pipe=4, data=5, model=2): 150, ProcessCoord(pipe=4, data=5, model=3): 151, ProcessCoord(pipe=4, data=6, model=0): 152, ProcessCoord(pipe=4, data=6, model=1): 153, ProcessCoord(pipe=4, data=6, model=2): 154, ProcessCoord(pipe=4, data=6, model=3): 155, ProcessCoord(pipe=4, data=7, model=0): 156, ProcessCoord(pipe=4, data=7, model=1): 157, ProcessCoord(pipe=4, data=7, model=2): 158, ProcessCoord(pipe=4, data=7, model=3): 159, ProcessCoord(pipe=5, data=0, model=0): 160, ProcessCoord(pipe=5, data=0, model=1): 161, ProcessCoord(pipe=5, data=0, model=2): 162, ProcessCoord(pipe=5, data=0, model=3): 163, ProcessCoord(pipe=5, data=1, model=0): 164, ProcessCoord(pipe=5, data=1, model=1): 165, ProcessCoord(pipe=5, data=1, model=2): 166, ProcessCoord(pipe=5, data=1, model=3): 167, ProcessCoord(pipe=5, data=2, model=0): 168, ProcessCoord(pipe=5, data=2, model=1): 169, ProcessCoord(pipe=5, data=2, model=2): 170, ProcessCoord(pipe=5, data=2, model=3): 171, ProcessCoord(pipe=5, data=3, model=0): 172, ProcessCoord(pipe=5, data=3, model=1): 173, ProcessCoord(pipe=5, data=3, model=2): 174, ProcessCoord(pipe=5, data=3, model=3): 175, ProcessCoord(pipe=5, data=4, model=0): 176, ProcessCoord(pipe=5, data=4, model=1): 177, ProcessCoord(pipe=5, data=4, model=2): 178, ProcessCoord(pipe=5, data=4, model=3): 179, ProcessCoord(pipe=5, data=5, model=0): 180, ProcessCoord(pipe=5, data=5, model=1): 181, ProcessCoord(pipe=5, data=5, model=2): 182, ProcessCoord(pipe=5, data=5, model=3): 183, ProcessCoord(pipe=5, data=6, model=0): 184, ProcessCoord(pipe=5, data=6, model=1): 185, ProcessCoord(pipe=5, data=6, model=2): 186, ProcessCoord(pipe=5, data=6, model=3): 187, ProcessCoord(pipe=5, data=7, model=0): 188, ProcessCoord(pipe=5, data=7, model=1): 189, ProcessCoord(pipe=5, data=7, model=2): 190, ProcessCoord(pipe=5, data=7, model=3): 191, ProcessCoord(pipe=6, data=0, model=0): 192, ProcessCoord(pipe=6, data=0, model=1): 193, ProcessCoord(pipe=6, data=0, model=2): 194, ProcessCoord(pipe=6, data=0, model=3): 195, ProcessCoord(pipe=6, data=1, model=0): 196, ProcessCoord(pipe=6, data=1, model=1): 197, ProcessCoord(pipe=6, data=1, model=2): 198, ProcessCoord(pipe=6, data=1, model=3): 199, ProcessCoord(pipe=6, data=2, model=0): 200, ProcessCoord(pipe=6, data=2, model=1): 201, ProcessCoord(pipe=6, data=2, model=2): 202, ProcessCoord(pipe=6, data=2, model=3): 203, ProcessCoord(pipe=6, data=3, model=0): 204, ProcessCoord(pipe=6, data=3, model=1): 205, ProcessCoord(pipe=6, data=3, model=2): 206, ProcessCoord(pipe=6, data=3, model=3): 207, ProcessCoord(pipe=6, data=4, model=0): 208, ProcessCoord(pipe=6, data=4, model=1): 209, ProcessCoord(pipe=6, data=4, model=2): 210, ProcessCoord(pipe=6, data=4, model=3): 211, ProcessCoord(pipe=6, data=5, model=0): 212, ProcessCoord(pipe=6, data=5, model=1): 213, ProcessCoord(pipe=6, data=5, model=2): 214, ProcessCoord(pipe=6, data=5, model=3): 215, ProcessCoord(pipe=6, data=6, model=0): 216, ProcessCoord(pipe=6, data=6, model=1): 217, ProcessCoord(pipe=6, data=6, model=2): 218, ProcessCoord(pipe=6, data=6, model=3): 219, ProcessCoord(pipe=6, data=7, model=0): 220, ProcessCoord(pipe=6, data=7, model=1): 221, ProcessCoord(pipe=6, data=7, model=2): 222, ProcessCoord(pipe=6, data=7, model=3): 223, ProcessCoord(pipe=7, data=0, model=0): 224, ProcessCoord(pipe=7, data=0, model=1): 225, ProcessCoord(pipe=7, data=0, 
model=2): 226, ProcessCoord(pipe=7, data=0, model=3): 227, ProcessCoord(pipe=7, data=1, model=0): 228, ProcessCoord(pipe=7, data=1, model=1): 229, ProcessCoord(pipe=7, data=1, model=2): 230, ProcessCoord(pipe=7, data=1, model=3): 231, ProcessCoord(pipe=7, data=2, model=0): 232, ProcessCoord(pipe=7, data=2, model=1): 233, ProcessCoord(pipe=7, data=2, model=2): 234, ProcessCoord(pipe=7, data=2, model=3): 235, ProcessCoord(pipe=7, data=3, model=0): 236, ProcessCoord(pipe=7, data=3, model=1): 237, ProcessCoord(pipe=7, data=3, model=2): 238, ProcessCoord(pipe=7, data=3, model=3): 239, ProcessCoord(pipe=7, data=4, model=0): 240, ProcessCoord(pipe=7, data=4, model=1): 241, ProcessCoord(pipe=7, data=4, model=2): 242, ProcessCoord(pipe=7, data=4, model=3): 243, ProcessCoord(pipe=7, data=5, model=0): 244, ProcessCoord(pipe=7, data=5, model=1): 245, ProcessCoord(pipe=7, data=5, model=2): 246, ProcessCoord(pipe=7, data=5, model=3): 247, ProcessCoord(pipe=7, data=6, model=0): 248, ProcessCoord(pipe=7, data=6, model=1): 249, ProcessCoord(pipe=7, data=6, model=2): 250, ProcessCoord(pipe=7, data=6, model=3): 251, ProcessCoord(pipe=7, data=7, model=0): 252, ProcessCoord(pipe=7, data=7, model=1): 253, ProcessCoord(pipe=7, data=7, model=2): 254, ProcessCoord(pipe=7, data=7, model=3): 255} -[2021-09-30 03:52:47,659] [INFO] [module.py:360:_partition_layers] Partitioning pipeline stages with method type:transformer -stage=0 layers=7 - 0: _to_float16 - 1: EmbeddingPipe - 2: - 3: ParallelTransformerLayerPipe - 4: ParallelTransformerLayerPipe - 5: ParallelTransformerLayerPipe - 6: ParallelTransformerLayerPipe -stage=1 layers=4 - 7: ParallelTransformerLayerPipe - 8: ParallelTransformerLayerPipe - 9: ParallelTransformerLayerPipe - 10: ParallelTransformerLayerPipe -stage=2 layers=4 - 11: ParallelTransformerLayerPipe - 12: ParallelTransformerLayerPipe - 13: ParallelTransformerLayerPipe - 14: ParallelTransformerLayerPipe -stage=3 layers=4 - 15: ParallelTransformerLayerPipe - 16: ParallelTransformerLayerPipe - 17: ParallelTransformerLayerPipe - 18: ParallelTransformerLayerPipe -stage=4 layers=4 - 19: ParallelTransformerLayerPipe - 20: ParallelTransformerLayerPipe - 21: ParallelTransformerLayerPipe - 22: ParallelTransformerLayerPipe -stage=5 layers=4 - 23: ParallelTransformerLayerPipe - 24: ParallelTransformerLayerPipe - 25: ParallelTransformerLayerPipe - 26: ParallelTransformerLayerPipe -stage=6 layers=4 - 27: ParallelTransformerLayerPipe - 28: ParallelTransformerLayerPipe - 29: ParallelTransformerLayerPipe - 30: ParallelTransformerLayerPipe -stage=7 layers=8 - 31: ParallelTransformerLayerPipe - 32: ParallelTransformerLayerPipe - 33: ParallelTransformerLayerPipe - 34: ParallelTransformerLayerPipe - 35: - 36: MixedFusedLayerNorm - 37: EmbeddingPipe - 38: float16_to_fp32 - loss: CrossEntropy - > number of parameters on (tensor, pipeline) model parallel rank (0, 4): 1745293312 - > number of parameters on (tensor, pipeline) model parallel rank (3, 4): 1745293312 - > number of parameters on (tensor, pipeline) model parallel rank (2, 4): 1745293312 - > number of parameters on (tensor, pipeline) model parallel rank (1, 4): 1745293312 - > number of parameters on (tensor, pipeline) model parallel rank (1, 3): 1745293312 - > number of parameters on (tensor, pipeline) model parallel rank (0, 3): 1745293312 - > number of parameters on (tensor, pipeline) model parallel rank (1, 1): 1745293312 > number of parameters on (tensor, pipeline) model parallel rank (3, 1): 1745293312 - - > number of parameters on (tensor, pipeline) model 
parallel rank (2, 1): 1745293312 - > number of parameters on (tensor, pipeline) model parallel rank (0, 1): 1745293312 - > number of parameters on (tensor, pipeline) model parallel rank (0, 2): 1745293312 - > number of parameters on (tensor, pipeline) model parallel rank (3, 2): 1745293312 - > number of parameters on (tensor, pipeline) model parallel rank (3, 3): 1745293312 - > number of parameters on (tensor, pipeline) model parallel rank (2, 2): 1745293312 - > number of parameters on (tensor, pipeline) model parallel rank (2, 3): 1745293312 - > number of parameters on (tensor, pipeline) model parallel rank (1, 2): 1745293312 - > number of parameters on (tensor, pipeline) model parallel rank (0, 5): 1745293312 - > number of parameters on (tensor, pipeline) model parallel rank (3, 5): 1745293312 - > number of parameters on (tensor, pipeline) model parallel rank (2, 5): 1745293312 - > number of parameters on (tensor, pipeline) model parallel rank (1, 5): 1745293312 - > number of parameters on (tensor, pipeline) model parallel rank (0, 6): 1745293312 - > number of parameters on (tensor, pipeline) model parallel rank (3, 6): 1745293312 - > number of parameters on (tensor, pipeline) model parallel rank (1, 6): 1745293312 - > number of parameters on (tensor, pipeline) model parallel rank (2, 6): 1745293312 - > number of parameters on (tensor, pipeline) model parallel rank (1, 0): 1986465792 - > number of parameters on (tensor, pipeline) model parallel rank (1, 7): 1986498560 - > number of parameters on (tensor, pipeline) model parallel rank (0, 7): 1986498560 - > number of parameters on (tensor, pipeline) model parallel rank (2, 0): 1986465792 - > number of parameters on (tensor, pipeline) model parallel rank (3, 7): 1986498560 - > number of parameters on (tensor, pipeline) model parallel rank (3, 0): 1986465792 - > number of parameters on (tensor, pipeline) model parallel rank (2, 7): 1986498560 -[2021-09-30 03:52:48,902] [INFO] [utils.py:680:see_memory_usage] After Building Model -[2021-09-30 03:52:48,903] [INFO] [utils.py:681:see_memory_usage] MA 3.77 GB Max_MA 3.79 GB CA 3.79 GB Max_CA 4 GB -[2021-09-30 03:52:48,903] [INFO] [utils.py:689:see_memory_usage] CPU Virtual Memory: used = 38.4 GB, percent = 20.5% - > number of parameters on (tensor, pipeline) model parallel rank (0, 0): 1986465792 -setting training iterations to 159576 -> learning rate decay style: cosine -DeepSpeed is enabled. 
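
Two sanity checks on the numbers printed above, using only values that appear in this log: the topology map enumerates ranks with the model (tensor-parallel) coordinate varying fastest, and the per-stage parameter counts reconcile exactly with the TOTAL_PARAMS/UNIQUE_PARAMS the DeepSpeed engine reports a few lines below (reading the TOTAL minus UNIQUE gap as the tied embedding counted on both pipeline ends is an inference from the stage layout):

    # Rank layout implied by the topology map above (pipe=8, data=8, model=4).
    def coord_to_rank(pipe, data, model, dp=8, tp=4):
        return pipe * dp * tp + data * tp + model

    assert coord_to_rank(2, 1, 3) == 71  # ProcessCoord(pipe=2, data=1, model=3): 71

    # Per-stage parameter counts (per tensor-parallel rank) from the listing above.
    mid   = 1_745_293_312  # stages 1..6: 4 transformer layers each
    first = 1_986_465_792  # stage 0: 4 layers + tied embedding
    last  = 1_986_498_560  # stage 7: 4 layers + tied embedding + final layernorm
    tp    = 4

    total = tp * (6 * mid + first + last)
    assert total == 57_778_896_896  # TOTAL_PARAMS reported by the engine below

    unique = 56_814_206_976         # UNIQUE_PARAMS: tied embedding counted once
    assert (total - unique) // tp == first - mid  # 241_172_480 embedding params per TP rank
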
-[2021-09-30 03:52:48,923] [INFO] [logging.py:68:log_dist] [Rank 0] DeepSpeed info: version=0.4.2+bc17042, git-hash=bc17042, git-branch=big-science -[2021-09-30 03:52:49,069] [INFO] [engine.py:179:__init__] DeepSpeed Flops Profiler Enabled: False -[2021-09-30 03:52:49,069] [INFO] [engine.py:736:_configure_optimizer] Removing param_group that has no 'params' in the client Optimizer -[2021-09-30 03:52:49,069] [INFO] [engine.py:741:_configure_optimizer] Using client Optimizer as basic optimizer -[2021-09-30 03:52:49,069] [INFO] [engine.py:750:_configure_optimizer] DeepSpeed Basic Optimizer = FusedAdam -[2021-09-30 03:52:49,069] [INFO] [utils.py:43:is_zero_supported_optimizer] Checking ZeRO support for optimizer=FusedAdam type= -[2021-09-30 03:52:49,069] [INFO] [logging.py:68:log_dist] [Rank 0] Creating fp16 ZeRO stage 1 optimizer -[2021-09-30 03:52:49,070] [INFO] [stage2.py:106:__init__] Reduce bucket size 500000000 -[2021-09-30 03:52:49,070] [INFO] [stage2.py:107:__init__] Allgather bucket size 500000000 -[2021-09-30 03:52:49,070] [INFO] [stage2.py:108:__init__] CPU Offload: False -[2021-09-30 03:52:49,070] [INFO] [stage2.py:109:__init__] Round robin gradient partitioning: False -[2021-09-30 03:52:53,500] [INFO] [stage2.py:419:__init__] optimizer state initialized -[2021-09-30 03:52:53,500] [INFO] [logging.py:68:log_dist] [Rank 0] DeepSpeed Final Optimizer = FusedAdam -[2021-09-30 03:52:53,500] [INFO] [engine.py:553:_configure_lr_scheduler] DeepSpeed using client LR scheduler -[2021-09-30 03:52:53,500] [INFO] [logging.py:68:log_dist] [Rank 0] DeepSpeed LR Scheduler = -[2021-09-30 03:52:53,500] [INFO] [logging.py:68:log_dist] [Rank 0] step=0, skipped=0, lr=[0.0, 0.0], mom=[(0.9, 0.999), (0.9, 0.999)] -[2021-09-30 03:52:53,500] [INFO] [config.py:900:print] DeepSpeedEngine configuration: -[2021-09-30 03:52:53,500] [INFO] [config.py:904:print] activation_checkpointing_config { - "partition_activations": false, - "contiguous_memory_optimization": false, - "cpu_checkpointing": false, - "number_checkpoints": null, - "synchronize_checkpoint_boundary": false, - "profile": false -} -[2021-09-30 03:52:53,500] [INFO] [config.py:904:print] aio_config ................... {'block_size': 1048576, 'queue_depth': 8, 'thread_count': 1, 'single_submit': False, 'overlap_events': True} -[2021-09-30 03:52:53,500] [INFO] [config.py:904:print] allreduce_always_fp32 ........ False -[2021-09-30 03:52:53,501] [INFO] [config.py:904:print] amp_enabled .................. False -[2021-09-30 03:52:53,501] [INFO] [config.py:904:print] amp_params ................... False -[2021-09-30 03:52:53,501] [INFO] [config.py:904:print] checkpoint_tag_validation_enabled True -[2021-09-30 03:52:53,501] [INFO] [config.py:904:print] checkpoint_tag_validation_fail False -[2021-09-30 03:52:53,501] [INFO] [config.py:904:print] disable_allgather ............ False -[2021-09-30 03:52:53,501] [INFO] [config.py:904:print] dump_state ................... False -[2021-09-30 03:52:53,501] [INFO] [config.py:904:print] dynamic_loss_scale_args ...... {'init_scale': 4096, 'scale_window': 500, 'delayed_shift': 2, 'min_scale': 1} -[2021-09-30 03:52:53,501] [INFO] [config.py:904:print] eigenvalue_enabled ........... False -[2021-09-30 03:52:53,501] [INFO] [config.py:904:print] eigenvalue_gas_boundary_resolution 1 -[2021-09-30 03:52:53,501] [INFO] [config.py:904:print] eigenvalue_layer_name ........ bert.encoder.layer -[2021-09-30 03:52:53,501] [INFO] [config.py:904:print] eigenvalue_layer_num ......... 
0 -[2021-09-30 03:52:53,501] [INFO] [config.py:904:print] eigenvalue_max_iter .......... 100 -[2021-09-30 03:52:53,501] [INFO] [config.py:904:print] eigenvalue_stability ......... 1e-06 -[2021-09-30 03:52:53,501] [INFO] [config.py:904:print] eigenvalue_tol ............... 0.01 -[2021-09-30 03:52:53,501] [INFO] [config.py:904:print] eigenvalue_verbose ........... False -[2021-09-30 03:52:53,501] [INFO] [config.py:904:print] elasticity_enabled ........... False -[2021-09-30 03:52:53,501] [INFO] [config.py:904:print] flops_profiler_config ........ { - "enabled": false, - "profile_step": 1, - "module_depth": -1, - "top_modules": 1, - "detailed": true, - "output_file": null -} -[2021-09-30 03:52:53,501] [INFO] [config.py:904:print] fp16_enabled ................. True -[2021-09-30 03:52:53,501] [INFO] [config.py:904:print] fp16_mixed_quantize .......... False -[2021-09-30 03:52:53,501] [INFO] [config.py:904:print] global_rank .................. 0 -[2021-09-30 03:52:53,501] [INFO] [config.py:904:print] gradient_accumulation_steps .. 256 -[2021-09-30 03:52:53,501] [INFO] [config.py:904:print] gradient_clipping ............ 1.0 -[2021-09-30 03:52:53,501] [INFO] [config.py:904:print] gradient_predivide_factor .... 1.0 -[2021-09-30 03:52:53,501] [INFO] [config.py:904:print] initial_dynamic_scale ........ 4096 -[2021-09-30 03:52:53,501] [INFO] [config.py:904:print] loss_scale ................... 0 -[2021-09-30 03:52:53,501] [INFO] [config.py:904:print] memory_breakdown ............. False -[2021-09-30 03:52:53,501] [INFO] [config.py:904:print] optimizer_legacy_fusion ...... False -[2021-09-30 03:52:53,501] [INFO] [config.py:904:print] optimizer_name ............... None -[2021-09-30 03:52:53,501] [INFO] [config.py:904:print] optimizer_params ............. None -[2021-09-30 03:52:53,501] [INFO] [config.py:904:print] pipeline ..................... {'stages': 'auto', 'partition': 'best', 'seed_layers': False, 'activation_checkpoint_interval': 0} -[2021-09-30 03:52:53,501] [INFO] [config.py:904:print] pld_enabled .................. False -[2021-09-30 03:52:53,501] [INFO] [config.py:904:print] pld_params ................... False -[2021-09-30 03:52:53,501] [INFO] [config.py:904:print] prescale_gradients ........... False -[2021-09-30 03:52:53,501] [INFO] [config.py:904:print] quantize_change_rate ......... 0.001 -[2021-09-30 03:52:53,502] [INFO] [config.py:904:print] quantize_groups .............. 1 -[2021-09-30 03:52:53,502] [INFO] [config.py:904:print] quantize_offset .............. 1000 -[2021-09-30 03:52:53,502] [INFO] [config.py:904:print] quantize_period .............. 1000 -[2021-09-30 03:52:53,502] [INFO] [config.py:904:print] quantize_rounding ............ 0 -[2021-09-30 03:52:53,502] [INFO] [config.py:904:print] quantize_start_bits .......... 16 -[2021-09-30 03:52:53,502] [INFO] [config.py:904:print] quantize_target_bits ......... 8 -[2021-09-30 03:52:53,502] [INFO] [config.py:904:print] quantize_training_enabled .... False -[2021-09-30 03:52:53,502] [INFO] [config.py:904:print] quantize_type ................ 0 -[2021-09-30 03:52:53,502] [INFO] [config.py:904:print] quantize_verbose ............. False -[2021-09-30 03:52:53,502] [INFO] [config.py:904:print] scheduler_name ............... None -[2021-09-30 03:52:53,502] [INFO] [config.py:904:print] scheduler_params ............. None -[2021-09-30 03:52:53,502] [INFO] [config.py:904:print] sparse_attention ............. None -[2021-09-30 03:52:53,502] [INFO] [config.py:904:print] sparse_gradients_enabled ..... 
False -[2021-09-30 03:52:53,502] [INFO] [config.py:904:print] steps_per_print .............. 2000 -[2021-09-30 03:52:53,502] [INFO] [config.py:904:print] tensorboard_enabled .......... False -[2021-09-30 03:52:53,502] [INFO] [config.py:904:print] tensorboard_job_name ......... DeepSpeedJobName -[2021-09-30 03:52:53,502] [INFO] [config.py:904:print] tensorboard_output_path ...... -[2021-09-30 03:52:53,502] [INFO] [config.py:904:print] train_batch_size ............. 2048 -[2021-09-30 03:52:53,502] [INFO] [config.py:904:print] train_micro_batch_size_per_gpu 1 -[2021-09-30 03:52:53,502] [INFO] [config.py:904:print] use_quantizer_kernel ......... False -[2021-09-30 03:52:53,502] [INFO] [config.py:904:print] wall_clock_breakdown ......... False -[2021-09-30 03:52:53,502] [INFO] [config.py:904:print] world_size ................... 8 -[2021-09-30 03:52:53,502] [INFO] [config.py:904:print] zero_allow_untested_optimizer False -[2021-09-30 03:52:53,502] [INFO] [config.py:904:print] zero_config .................. { - "stage": 1, - "contiguous_gradients": false, - "reduce_scatter": true, - "reduce_bucket_size": 5.000000e+08, - "allgather_partitions": true, - "allgather_bucket_size": 5.000000e+08, - "overlap_comm": false, - "load_from_fp32_weights": true, - "elastic_checkpoint": true, - "offload_param": null, - "offload_optimizer": null, - "sub_group_size": 1.000000e+09, - "prefetch_bucket_size": 5.000000e+07, - "param_persistence_threshold": 1.000000e+05, - "max_live_parameters": 1.000000e+09, - "max_reuse_distance": 1.000000e+09, - "gather_fp16_weights_on_model_save": false, - "ignore_unused_parameters": true, - "round_robin_gradients": false, - "legacy_stage1": false -} -[2021-09-30 03:52:53,502] [INFO] [config.py:904:print] zero_enabled ................. True -[2021-09-30 03:52:53,502] [INFO] [config.py:904:print] zero_optimization_stage ...... 
1 -[2021-09-30 03:52:53,502] [INFO] [config.py:906:print] json = { - "train_micro_batch_size_per_gpu": 1, - "train_batch_size": 2.048000e+03, - "gradient_clipping": 1.0, - "zero_optimization": { - "stage": 1 - }, - "fp16": { - "enabled": true, - "loss_scale": 0, - "loss_scale_window": 500, - "hysteresis": 2, - "min_loss_scale": 1, - "initial_scale_power": 12 - }, - "steps_per_print": 2.000000e+03, - "wall_clock_breakdown": false -} -[2021-09-30 03:52:53,503] [INFO] [engine.py:76:__init__] CONFIG: micro_batches=256 micro_batch_size=1 -[2021-09-30 03:52:54,122] [INFO] [engine.py:134:__init__] RANK=0 STAGE=0 LAYERS=7 [0, 7) STAGE_PARAMS=1986465792 (1986.466M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-30 03:52:54,122] [INFO] [engine.py:134:__init__] RANK=2 STAGE=0 LAYERS=7 [0, 7) STAGE_PARAMS=1986465792 (1986.466M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-30 03:52:54,122] [INFO] [engine.py:134:__init__] RANK=3 STAGE=0 LAYERS=7 [0, 7) STAGE_PARAMS=1986465792 (1986.466M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-30 03:52:54,122] [INFO] [engine.py:134:__init__] RANK=1 STAGE=0 LAYERS=7 [0, 7) STAGE_PARAMS=1986465792 (1986.466M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-30 03:52:54,122] [INFO] [engine.py:134:__init__] RANK=129 STAGE=4 LAYERS=4 [19, 23) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-30 03:52:54,122] [INFO] [engine.py:134:__init__] RANK=128 STAGE=4 LAYERS=4 [19, 23) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-30 03:52:54,122] [INFO] [engine.py:134:__init__] RANK=130 STAGE=4 LAYERS=4 [19, 23) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-30 03:52:54,122] [INFO] [engine.py:134:__init__] RANK=131 STAGE=4 LAYERS=4 [19, 23) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-30 03:52:54,122] [INFO] [engine.py:134:__init__] RANK=66 STAGE=2 LAYERS=4 [11, 15) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-30 03:52:54,122] [INFO] [engine.py:134:__init__] RANK=67 STAGE=2 LAYERS=4 [11, 15) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-30 03:52:54,122] [INFO] [engine.py:134:__init__] RANK=64 STAGE=2 LAYERS=4 [11, 15) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-30 03:52:54,122] [INFO] [engine.py:134:__init__] RANK=225 STAGE=7 LAYERS=8 [31, 39) STAGE_PARAMS=1986498560 (1986.499M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-30 03:52:54,122] [INFO] [engine.py:134:__init__] RANK=226 STAGE=7 LAYERS=8 [31, 39) STAGE_PARAMS=1986498560 (1986.499M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-30 03:52:54,122] [INFO] [engine.py:134:__init__] RANK=99 STAGE=3 LAYERS=4 [15, 19) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-30 03:52:54,122] [INFO] [engine.py:134:__init__] RANK=98 STAGE=3 LAYERS=4 [15, 19) STAGE_PARAMS=1745293312 (1745.293M) 
TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-30 03:52:54,122] [INFO] [engine.py:134:__init__] RANK=96 STAGE=3 LAYERS=4 [15, 19) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-30 03:52:54,122] [INFO] [engine.py:134:__init__] RANK=193 STAGE=6 LAYERS=4 [27, 31) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-30 03:52:54,122] [INFO] [engine.py:134:__init__] RANK=194 STAGE=6 LAYERS=4 [27, 31) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-30 03:52:54,122] [INFO] [engine.py:134:__init__] RANK=195 STAGE=6 LAYERS=4 [27, 31) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-30 03:52:54,122] [INFO] [engine.py:134:__init__] RANK=192 STAGE=6 LAYERS=4 [27, 31) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-30 03:52:54,122] [INFO] [engine.py:134:__init__] RANK=65 STAGE=2 LAYERS=4 [11, 15) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-30 03:52:54,122] [INFO] [engine.py:134:__init__] RANK=35 STAGE=1 LAYERS=4 [7, 11) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-30 03:52:54,122] [INFO] [engine.py:134:__init__] RANK=34 STAGE=1 LAYERS=4 [7, 11) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-30 03:52:54,122] [INFO] [engine.py:134:__init__] RANK=32 STAGE=1 LAYERS=4 [7, 11) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-30 03:52:54,122] [INFO] [engine.py:134:__init__] RANK=33 STAGE=1 LAYERS=4 [7, 11) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-30 03:52:54,122] [INFO] [engine.py:134:__init__] RANK=160 STAGE=5 LAYERS=4 [23, 27) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-30 03:52:54,122] [INFO] [engine.py:134:__init__] RANK=162 STAGE=5 LAYERS=4 [23, 27) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-30 03:52:54,122] [INFO] [engine.py:134:__init__] RANK=161 STAGE=5 LAYERS=4 [23, 27) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-30 03:52:54,122] [INFO] [engine.py:134:__init__] RANK=163 STAGE=5 LAYERS=4 [23, 27) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-30 03:52:54,122] [INFO] [engine.py:134:__init__] RANK=224 STAGE=7 LAYERS=8 [31, 39) STAGE_PARAMS=1986498560 (1986.499M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-30 03:52:54,122] [INFO] [engine.py:134:__init__] RANK=97 STAGE=3 LAYERS=4 [15, 19) STAGE_PARAMS=1745293312 (1745.293M) TOTAL_PARAMS=57778896896 (57778.897M) UNIQUE_PARAMS=56814206976 (56814.207M) -[2021-09-30 03:52:54,122] [INFO] [engine.py:134:__init__] RANK=227 STAGE=7 LAYERS=8 [31, 39) STAGE_PARAMS=1986498560 (1986.499M) TOTAL_PARAMS=57778896896 (57778.897M) 
UNIQUE_PARAMS=56814206976 (56814.207M) - > using checkpoint value 6e-05 for learning rate - > using checkpoint value 6e-06 for minimum learning rate - > using checkpoint value 216320 for warmup iterations - > using checkpoint value 126953125 for total number of iterations - > using checkpoint value cosine for decay style -successfully loaded 8 ZeRO state_dicts for rank 156 -successfully loaded 8 ZeRO state_dicts for rank 41 -successfully loaded 8 ZeRO state_dicts for rank 40 -successfully loaded 8 ZeRO state_dicts for rank 42 -successfully loaded 8 ZeRO state_dicts for rank 204 -successfully loaded 8 ZeRO state_dicts for rank 168 -successfully loaded 8 ZeRO state_dicts for rank 52 -successfully loaded 8 ZeRO state_dicts for rank 207 -successfully loaded 8 ZeRO state_dicts for rank 200 -successfully loaded 8 ZeRO state_dicts for rank 158 -successfully loaded 8 ZeRO state_dicts for rank 43 -successfully loaded 8 ZeRO state_dicts for rank 53 -successfully loaded 8 ZeRO state_dicts for rank 104 -successfully loaded 8 ZeRO state_dicts for rank 140 -successfully loaded 8 ZeRO state_dicts for rank 203 -successfully loaded 8 ZeRO state_dicts for rank 206 -successfully loaded 8 ZeRO state_dicts for rank 220 -successfully loaded 8 ZeRO state_dicts for rank 128 -successfully loaded 8 ZeRO state_dicts for rank 75 -successfully loaded 8 ZeRO state_dicts for rank 84 -successfully loaded 8 ZeRO state_dicts for rank 80 -successfully loaded 8 ZeRO state_dicts for rank 201 -successfully loaded 8 ZeRO state_dicts for rank 157 -successfully loaded 8 ZeRO state_dicts for rank 115 -successfully loaded 8 ZeRO state_dicts for rank 202 -successfully loaded 8 ZeRO state_dicts for rank 106 -successfully loaded 8 ZeRO state_dicts for rank 174 -successfully loaded 8 ZeRO state_dicts for rank 105 -successfully loaded 8 ZeRO state_dicts for rank 100 -successfully loaded 8 ZeRO state_dicts for rank 216 -successfully loaded 8 ZeRO state_dicts for rank 196 -successfully loaded 8 ZeRO state_dicts for rank 34 -successfully loaded 8 ZeRO state_dicts for rank 138 -successfully loaded 8 ZeRO state_dicts for rank 136 -successfully loaded 8 ZeRO state_dicts for rank 142 -successfully loaded 8 ZeRO state_dicts for rank 83 -successfully loaded 8 ZeRO state_dicts for rank 130 -successfully loaded 8 ZeRO state_dicts for rank 199 -successfully loaded 8 ZeRO state_dicts for rank 38 -successfully loaded 8 ZeRO state_dicts for rank 117 -successfully loaded 8 ZeRO state_dicts for rank 112 -successfully loaded 8 ZeRO state_dicts for rank 54 -successfully loaded 8 ZeRO state_dicts for rank 134 -successfully loaded 8 ZeRO state_dicts for rank 55 -successfully loaded 8 ZeRO state_dicts for rank 213 -successfully loaded 8 ZeRO state_dicts for rank 96 -successfully loaded 8 ZeRO state_dicts for rank 90 -successfully loaded 8 ZeRO state_dicts for rank 215 -successfully loaded 8 ZeRO state_dicts for rank 60 -successfully loaded 8 ZeRO state_dicts for rank 211 -successfully loaded 8 ZeRO state_dicts for rank 171 -successfully loaded 8 ZeRO state_dicts for rank 81 -successfully loaded 8 ZeRO state_dicts for rank 73 -loading 8 zero partition checkpoints for rank 156 -successfully loaded 8 ZeRO state_dicts for rank 69 -successfully loaded 8 ZeRO state_dicts for rank 70 -successfully loaded 8 ZeRO state_dicts for rank 124 -successfully loaded 8 ZeRO state_dicts for rank 195 -successfully loaded 8 ZeRO state_dicts for rank 170 -successfully loaded 8 ZeRO state_dicts for rank 36 -successfully loaded 8 ZeRO state_dicts for rank 103 -successfully loaded 8 
ZeRO state_dicts for rank 169 -successfully loaded 8 ZeRO state_dicts for rank 219 -successfully loaded 8 ZeRO state_dicts for rank 208 -loading 8 zero partition checkpoints for rank 40 -successfully loaded 8 ZeRO state_dicts for rank 48 -successfully loaded 8 ZeRO state_dicts for rank 51 -successfully loaded 8 ZeRO state_dicts for rank 150 -successfully loaded 8 ZeRO state_dicts for rank 32 -successfully loaded 8 ZeRO state_dicts for rank 141 -successfully loaded 8 ZeRO state_dicts for rank 107 -successfully loaded 8 ZeRO state_dicts for rank 120 -successfully loaded 8 ZeRO state_dicts for rank 132 -successfully loaded 8 ZeRO state_dicts for rank 95 -successfully loaded 8 ZeRO state_dicts for rank 58 -successfully loaded 8 ZeRO state_dicts for rank 50 -loading 8 zero partition checkpoints for rank 41 -successfully loaded 8 ZeRO state_dicts for rank 113 -successfully loaded 8 ZeRO state_dicts for rank 139 -successfully loaded 8 ZeRO state_dicts for rank 64 -successfully loaded 8 ZeRO state_dicts for rank 135 -successfully loaded 8 ZeRO state_dicts for rank 109 -successfully loaded 8 ZeRO state_dicts for rank 67 -successfully loaded 8 ZeRO state_dicts for rank 62 -successfully loaded 8 ZeRO state_dicts for rank 127 -successfully loaded 8 ZeRO state_dicts for rank 223 -successfully loaded 8 ZeRO state_dicts for rank 71 -successfully loaded 8 ZeRO state_dicts for rank 44 -successfully loaded 8 ZeRO state_dicts for rank 143 -successfully loaded 8 ZeRO state_dicts for rank 205 -successfully loaded 8 ZeRO state_dicts for rank 56 -successfully loaded 8 ZeRO state_dicts for rank 209 -successfully loaded 8 ZeRO state_dicts for rank 125 -successfully loaded 8 ZeRO state_dicts for rank 175 -successfully loaded 8 ZeRO state_dicts for rank 191 -successfully loaded 8 ZeRO state_dicts for rank 214 -successfully loaded 8 ZeRO state_dicts for rank 46 -successfully loaded 8 ZeRO state_dicts for rank 63 -successfully loaded 8 ZeRO state_dicts for rank 129 -successfully loaded 8 ZeRO state_dicts for rank 131 -successfully loaded 8 ZeRO state_dicts for rank 152 -successfully loaded 8 ZeRO state_dicts for rank 94 -successfully loaded 8 ZeRO state_dicts for rank 65 -successfully loaded 8 ZeRO state_dicts for rank 144 -successfully loaded 8 ZeRO state_dicts for rank 121 -successfully loaded 8 ZeRO state_dicts for rank 57 -successfully loaded 8 ZeRO state_dicts for rank 108 -successfully loaded 8 ZeRO state_dicts for rank 61 -successfully loaded 8 ZeRO state_dicts for rank 218 -successfully loaded 8 ZeRO state_dicts for rank 133 -successfully loaded 8 ZeRO state_dicts for rank 74 -successfully loaded 8 ZeRO state_dicts for rank 72 -successfully loaded 8 ZeRO state_dicts for rank 137 -successfully loaded 8 ZeRO state_dicts for rank 160 -successfully loaded 8 ZeRO state_dicts for rank 101 -successfully loaded 8 ZeRO state_dicts for rank 28 -successfully loaded 8 ZeRO state_dicts for rank 151 -successfully loaded 8 ZeRO state_dicts for rank 188 -successfully loaded 8 ZeRO state_dicts for rank 164 -successfully loaded 8 ZeRO state_dicts for rank 180 -successfully loaded 8 ZeRO state_dicts for rank 111 -successfully loaded 8 ZeRO state_dicts for rank 176 -successfully loaded 8 ZeRO state_dicts for rank 110 -successfully loaded 8 ZeRO state_dicts for rank 198 -successfully loaded 8 ZeRO state_dicts for rank 66 -successfully loaded 8 ZeRO state_dicts for rank 222 -successfully loaded 8 ZeRO state_dicts for rank 184 -successfully loaded 8 ZeRO state_dicts for rank 35 -successfully loaded 8 ZeRO state_dicts for rank 98 
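
Each rank reports loading 8 ZeRO state_dicts: with ZeRO stage 1 over a data-parallel group of size 8 (and "elastic_checkpoint": true in the zero_config above), the fp32 optimizer state was saved as one partition file per data-parallel rank, and on resume every rank reads all 8 partitions and re-extracts its own shard. A rough sketch of the idea, not DeepSpeed's actual implementation; the state_dict key below is illustrative:

    import torch

    def load_my_zero1_partition(shard_files, my_dp_rank, dp=8):
        # Read all dp shards of the flat fp32 optimizer state, then re-slice
        # so the current data-parallel rank keeps only its own partition.
        shards = [torch.load(f, map_location="cpu") for f in shard_files]  # 8 state_dicts
        flat = torch.cat([s["fp32_flat_partition"] for s in shards])       # illustrative key
        part = flat.numel() // dp
        return flat[my_dp_rank * part:(my_dp_rank + 1) * part]
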
-successfully loaded 8 ZeRO state_dicts for rank 210 -successfully loaded 8 ZeRO state_dicts for rank 154 -successfully loaded 8 ZeRO state_dicts for rank 172 -successfully loaded 8 ZeRO state_dicts for rank 178 -successfully loaded 8 ZeRO state_dicts for rank 190 -successfully loaded 8 ZeRO state_dicts for rank 39 -loading 8 zero partition checkpoints for rank 42 -successfully loaded 8 ZeRO state_dicts for rank 146 -successfully loaded 8 ZeRO state_dicts for rank 212 -successfully loaded 8 ZeRO state_dicts for rank 85 -successfully loaded 8 ZeRO state_dicts for rank 91 -successfully loaded 8 ZeRO state_dicts for rank 148 -successfully loaded 8 ZeRO state_dicts for rank 153 -successfully loaded 8 ZeRO state_dicts for rank 114 -successfully loaded 8 ZeRO state_dicts for rank 59 -successfully loaded 8 ZeRO state_dicts for rank 68 -successfully loaded 8 ZeRO state_dicts for rank 123 -successfully loaded 8 ZeRO state_dicts for rank 182 -successfully loaded 8 ZeRO state_dicts for rank 221 -loading 8 zero partition checkpoints for rank 204 -successfully loaded 8 ZeRO state_dicts for rank 99 -successfully loaded 8 ZeRO state_dicts for rank 93 -successfully loaded 8 ZeRO state_dicts for rank 89 -successfully loaded 8 ZeRO state_dicts for rank 77 -loading 8 zero partition checkpoints for rank 52 -successfully loaded 8 ZeRO state_dicts for rank 167 -successfully loaded 8 ZeRO state_dicts for rank 97 -successfully loaded 8 ZeRO state_dicts for rank 179 -successfully loaded 8 ZeRO state_dicts for rank 116 -successfully loaded 8 ZeRO state_dicts for rank 173 -successfully loaded 8 ZeRO state_dicts for rank 147 -successfully loaded 8 ZeRO state_dicts for rank 33 -successfully loaded 8 ZeRO state_dicts for rank 22 -successfully loaded 8 ZeRO state_dicts for rank 2 -successfully loaded 8 ZeRO state_dicts for rank 49 -successfully loaded 8 ZeRO state_dicts for rank 119 -successfully loaded 8 ZeRO state_dicts for rank 37 -successfully loaded 8 ZeRO state_dicts for rank 187 -successfully loaded 8 ZeRO state_dicts for rank 192 -successfully loaded 8 ZeRO state_dicts for rank 155 -successfully loaded 8 ZeRO state_dicts for rank 159 -successfully loaded 8 ZeRO state_dicts for rank 87 -successfully loaded 8 ZeRO state_dicts for rank 92 -loading 8 zero partition checkpoints for rank 168 -successfully loaded 8 ZeRO state_dicts for rank 165 -successfully loaded 8 ZeRO state_dicts for rank 82 -successfully loaded 8 ZeRO state_dicts for rank 161 -successfully loaded 8 ZeRO state_dicts for rank 189 -loading 8 zero partition checkpoints for rank 207 -successfully loaded 8 ZeRO state_dicts for rank 166 -successfully loaded 8 ZeRO state_dicts for rank 76 -successfully loaded 8 ZeRO state_dicts for rank 79 -successfully loaded 8 ZeRO state_dicts for rank 193 -successfully loaded 8 ZeRO state_dicts for rank 26 -successfully loaded 8 ZeRO state_dicts for rank 217 -successfully loaded 8 ZeRO state_dicts for rank 162 -successfully loaded 8 ZeRO state_dicts for rank 181 -successfully loaded 8 ZeRO state_dicts for rank 186 -successfully loaded 8 ZeRO state_dicts for rank 194 -loading 8 zero partition checkpoints for rank 200 -loading 8 zero partition checkpoints for rank 53 -loading 8 zero partition checkpoints for rank 104 -successfully loaded 8 ZeRO state_dicts for rank 20 -successfully loaded 8 ZeRO state_dicts for rank 145 -successfully loaded 8 ZeRO state_dicts for rank 102 -successfully loaded 8 ZeRO state_dicts for rank 86 -successfully loaded 8 ZeRO state_dicts for rank 88 -successfully loaded 8 ZeRO state_dicts for 
rank 18 -successfully loaded 8 ZeRO state_dicts for rank 163 -loading 8 zero partition checkpoints for rank 140 -successfully loaded 8 ZeRO state_dicts for rank 149 -successfully loaded 8 ZeRO state_dicts for rank 23 -loading 8 zero partition checkpoints for rank 43 -successfully loaded 8 ZeRO state_dicts for rank 78 -successfully loaded 8 ZeRO state_dicts for rank 45 -successfully loaded 8 ZeRO state_dicts for rank 197 -successfully loaded 8 ZeRO state_dicts for rank 122 -successfully loaded 8 ZeRO state_dicts for rank 47 -successfully loaded 8 ZeRO state_dicts for rank 30 -successfully loaded 8 ZeRO state_dicts for rank 126 -loading 8 zero partition checkpoints for rank 206 -successfully loaded 8 ZeRO state_dicts for rank 14 -successfully loaded 8 ZeRO state_dicts for rank 231 -successfully loaded 8 ZeRO state_dicts for rank 31 -successfully loaded 8 ZeRO state_dicts for rank 16 -loading 8 zero partition checkpoints for rank 220 -successfully loaded 8 ZeRO state_dicts for rank 29 -loading 8 zero partition checkpoints for rank 75 -successfully loaded 8 ZeRO state_dicts for rank 185 -successfully loaded 8 ZeRO state_dicts for rank 183 -successfully loaded 8 ZeRO state_dicts for rank 228 -successfully loaded 8 ZeRO state_dicts for rank 118 -successfully loaded 8 ZeRO state_dicts for rank 177 -loading 8 zero partition checkpoints for rank 128 -successfully loaded 8 ZeRO state_dicts for rank 240 -successfully loaded 8 ZeRO state_dicts for rank 12 -successfully loaded 8 ZeRO state_dicts for rank 224 -successfully loaded 8 ZeRO state_dicts for rank 232 -successfully loaded 8 ZeRO state_dicts for rank 248 -successfully loaded 8 ZeRO state_dicts for rank 21 -successfully loaded 8 ZeRO state_dicts for rank 0 -successfully loaded 8 ZeRO state_dicts for rank 27 -loading 8 zero partition checkpoints for rank 106 -successfully loaded 8 ZeRO state_dicts for rank 6 -loading 8 zero partition checkpoints for rank 80 -loading 8 zero partition checkpoints for rank 105 -successfully loaded 8 ZeRO state_dicts for rank 19 -successfully loaded 8 ZeRO state_dicts for rank 244 -loading 8 zero partition checkpoints for rank 174 -successfully loaded 8 ZeRO state_dicts for rank 236 -loading 8 zero partition checkpoints for rank 84 -loading 8 zero partition checkpoints for rank 100 -loading 8 zero partition checkpoints for rank 196 -loading 8 zero partition checkpoints for rank 202 -loading 8 zero partition checkpoints for rank 136 -successfully loaded 8 ZeRO state_dicts for rank 3 -successfully loaded 8 ZeRO state_dicts for rank 226 -successfully loaded 8 ZeRO state_dicts for rank 24 -successfully loaded 8 ZeRO state_dicts for rank 246 -successfully loaded 8 ZeRO state_dicts for rank 255 -loading 8 zero partition checkpoints for rank 83 -loading 8 zero partition checkpoints for rank 112 -loading 8 zero partition checkpoints for rank 38 -loading 8 zero partition checkpoints for rank 130 -loading 8 zero partition checkpoints for rank 138 -loading 8 zero partition checkpoints for rank 55 -successfully loaded 8 ZeRO state_dicts for rank 15 -successfully loaded 8 ZeRO state_dicts for rank 243 -successfully loaded 8 ZeRO state_dicts for rank 251 -loading 8 zero partition checkpoints for rank 54 -loading 8 zero partition checkpoints for rank 171 -loading 8 zero partition checkpoints for rank 158 -loading 8 zero partition checkpoints for rank 81 -loading 8 zero partition checkpoints for rank 90 -successfully loaded 8 ZeRO state_dicts for rank 252 -loading 8 zero partition checkpoints for rank 215 -successfully loaded 8 ZeRO 
state_dicts for rank 247 -loading 8 zero partition checkpoints for rank 60 -loading 8 zero partition checkpoints for rank 96 -loading 8 zero partition checkpoints for rank 170 -loading 8 zero partition checkpoints for rank 73 -successfully loaded 8 ZeRO state_dicts for rank 227 -loading 8 zero partition checkpoints for rank 103 -successfully loaded 8 ZeRO state_dicts for rank 241 -loading 8 zero partition checkpoints for rank 216 -loading 8 zero partition checkpoints for rank 211 -loading 8 zero partition checkpoints for rank 169 -successfully loaded 8 ZeRO state_dicts for rank 229 -loading 8 zero partition checkpoints for rank 51 -successfully loaded 8 ZeRO state_dicts for rank 4 -loading 8 zero partition checkpoints for rank 117 -successfully loaded 8 ZeRO state_dicts for rank 230 -successfully loaded 8 ZeRO state_dicts for rank 17 -loading 8 zero partition checkpoints for rank 213 -successfully loaded 8 ZeRO state_dicts for rank 242 -successfully loaded 8 ZeRO state_dicts for rank 250 -loading 8 zero partition checkpoints for rank 208 -successfully loaded 8 ZeRO state_dicts for rank 225 -successfully loaded 8 ZeRO state_dicts for rank 9 -successfully loaded 8 ZeRO state_dicts for rank 1 -successfully loaded 8 ZeRO state_dicts for rank 11 -successfully loaded 8 ZeRO state_dicts for rank 7 -successfully loaded 8 ZeRO state_dicts for rank 253 -successfully loaded 8 ZeRO state_dicts for rank 237 -loading 8 zero partition checkpoints for rank 70 -loading 8 zero partition checkpoints for rank 107 -successfully loaded 8 ZeRO state_dicts for rank 5 -loading 8 zero partition checkpoints for rank 48 -successfully loaded 8 ZeRO state_dicts for rank 245 -loading 8 zero partition checkpoints for rank 58 -loading 8 zero partition checkpoints for rank 67 -loading 8 zero partition checkpoints for rank 135 -successfully loaded 8 ZeRO state_dicts for rank 234 -loading 8 zero partition checkpoints for rank 109 -loading 8 zero partition checkpoints for rank 150 -successfully loaded 8 ZeRO state_dicts for rank 25 -loading 8 zero partition checkpoints for rank 44 -loading 8 zero partition checkpoints for rank 62 -loading 8 zero partition checkpoints for rank 50 -loading 8 zero partition checkpoints for rank 32 -successfully loaded 8 ZeRO state_dicts for rank 10 -loading 8 zero partition checkpoints for rank 132 -loading 8 zero partition checkpoints for rank 209 -loading 8 zero partition checkpoints for rank 205 -successfully loaded 8 ZeRO state_dicts for rank 249 -loading 8 zero partition checkpoints for rank 139 -loading 8 zero partition checkpoints for rank 127 -successfully loaded 8 ZeRO state_dicts for rank 233 -loading 8 zero partition checkpoints for rank 157 -successfully loaded 8 ZeRO state_dicts for rank 235 -loading 8 zero partition checkpoints for rank 134 -loading 8 zero partition checkpoints for rank 143 -loading 8 zero partition checkpoints for rank 175 -loading 8 zero partition checkpoints for rank 131 -successfully loaded 8 ZeRO state_dicts for rank 254 -loading 8 zero partition checkpoints for rank 94 -loading 8 zero partition checkpoints for rank 214 -loading 8 zero partition checkpoints for rank 120 -loading 8 zero partition checkpoints for rank 144 -loading 8 zero partition checkpoints for rank 36 -successfully loaded 8 ZeRO state_dicts for rank 239 -loading 8 zero partition checkpoints for rank 108 -successfully loaded 8 ZeRO state_dicts for rank 238 -loading 8 zero partition checkpoints for rank 61 -loading 8 zero partition checkpoints for rank 129 -loading 8 zero partition checkpoints 
for rank 63 -loading 8 zero partition checkpoints for rank 152 -loading 8 zero partition checkpoints for rank 137 -successfully loaded 8 ZeRO state_dicts for rank 8 -loading 8 zero partition checkpoints for rank 219 -loading 8 zero partition checkpoints for rank 151 -loading 8 zero partition checkpoints for rank 195 -loading 8 zero partition checkpoints for rank 164 -loading 8 zero partition checkpoints for rank 203 -loading 8 zero partition checkpoints for rank 210 -loading 8 zero partition checkpoints for rank 114 -loading 8 zero partition checkpoints for rank 212 -loading 8 zero partition checkpoints for rank 101 -loading 8 zero partition checkpoints for rank 180 -loading 8 zero partition checkpoints for rank 218 -loading 8 zero partition checkpoints for rank 172 -loading 8 zero partition checkpoints for rank 125 -loading 8 zero partition checkpoints for rank 99 -loading 8 zero partition checkpoints for rank 95 -loading 8 zero partition checkpoints for rank 153 -loading 8 zero partition checkpoints for rank 68 -loading 8 zero partition checkpoints for rank 188 -loading 8 zero partition checkpoints for rank 98 -loading 8 zero partition checkpoints for rank 91 -loading 8 zero partition checkpoints for rank 33 -loading 8 zero partition checkpoints for rank 184 -loading 8 zero partition checkpoints for rank 182 -loading 8 zero partition checkpoints for rank 154 -loading 8 zero partition checkpoints for rank 178 -loading 8 zero partition checkpoints for rank 77 -loading 8 zero partition checkpoints for rank 89 -loading 8 zero partition checkpoints for rank 198 -loading 8 zero partition checkpoints for rank 85 -loading 8 zero partition checkpoints for rank 37 -loading 8 zero partition checkpoints for rank 97 -loading 8 zero partition checkpoints for rank 110 -loading 8 zero partition checkpoints for rank 66 -loading 8 zero partition checkpoints for rank 111 -loading 8 zero partition checkpoints for rank 161 -loading 8 zero partition checkpoints for rank 189 -loading 8 zero partition checkpoints for rank 147 -loading 8 zero partition checkpoints for rank 146 -loading 8 zero partition checkpoints for rank 116 -loading 8 zero partition checkpoints for rank 173 -loading 8 zero partition checkpoints for rank 59 -loading 8 zero partition checkpoints for rank 28 -loading 8 zero partition checkpoints for rank 221 -loading 8 zero partition checkpoints for rank 133 -loading 8 zero partition checkpoints for rank 201 -loading 8 zero partition checkpoints for rank 166 -loading 8 zero partition checkpoints for rank 148 -loading 8 zero partition checkpoints for rank 82 -loading 8 zero partition checkpoints for rank 87 -loading 8 zero partition checkpoints for rank 192 -loading 8 zero partition checkpoints for rank 79 -successfully loaded 8 ZeRO state_dicts for rank 13 -loading 8 zero partition checkpoints for rank 115 -loading 8 zero partition checkpoints for rank 181 -loading 8 zero partition checkpoints for rank 124 -loading 8 zero partition checkpoints for rank 193 -loading 8 zero partition checkpoints for rank 76 -loading 8 zero partition checkpoints for rank 119 -loading 8 zero partition checkpoints for rank 74 -loading 8 zero partition checkpoints for rank 186 -loading 8 zero partition checkpoints for rank 187 -loading 8 zero partition checkpoints for rank 72 -loading 8 zero partition checkpoints for rank 163 -loading 8 zero partition checkpoints for rank 2 -loading 8 zero partition checkpoints for rank 92 -loading 8 zero partition checkpoints for rank 149 -loading 8 zero partition checkpoints for 
rank 217 -loading 8 zero partition checkpoints for rank 88 -loading 8 zero partition checkpoints for rank 39 -loading 8 zero partition checkpoints for rank 69 -loading 8 zero partition checkpoints for rank 78 -loading 8 zero partition checkpoints for rank 199 -loading 8 zero partition checkpoints for rank 155 -loading 8 zero partition checkpoints for rank 176 -loading 8 zero partition checkpoints for rank 22 -loading 8 zero partition checkpoints for rank 49 -loading 8 zero partition checkpoints for rank 86 -loading 8 zero partition checkpoints for rank 34 -loading 8 zero partition checkpoints for rank 93 -loading 8 zero partition checkpoints for rank 102 -loading 8 zero partition checkpoints for rank 142 -loading 8 zero partition checkpoints for rank 56 -loading 8 zero partition checkpoints for rank 223 -loading 8 zero partition checkpoints for rank 160 -loading 8 zero partition checkpoints for rank 145 -loading 8 zero partition checkpoints for rank 179 -loading 8 zero partition checkpoints for rank 45 -loading 8 zero partition checkpoints for rank 159 -loading 8 zero partition checkpoints for rank 185 -loading 8 zero partition checkpoints for rank 113 -loading 8 zero partition checkpoints for rank 177 -loading 8 zero partition checkpoints for rank 183 -loading 8 zero partition checkpoints for rank 118 -loading 8 zero partition checkpoints for rank 71 -loading 8 zero partition checkpoints for rank 57 -loading 8 zero partition checkpoints for rank 18 -loading 8 zero partition checkpoints for rank 141 -loading 8 zero partition checkpoints for rank 122 -loading 8 zero partition checkpoints for rank 194 -loading 8 zero partition checkpoints for rank 222 -loading 8 zero partition checkpoints for rank 64 -loading 8 zero partition checkpoints for rank 162 -loading 8 zero partition checkpoints for rank 35 -loading 8 zero partition checkpoints for rank 29 -loading 8 zero partition checkpoints for rank 20 -loading 8 zero partition checkpoints for rank 191 -loading 8 zero partition checkpoints for rank 46 -loading 8 zero partition checkpoints for rank 126 -loading 8 zero partition checkpoints for rank 21 -loading 8 zero partition checkpoints for rank 190 -loading 8 zero partition checkpoints for rank 65 -loading 8 zero partition checkpoints for rank 23 -loading 8 zero partition checkpoints for rank 3 -loading 8 zero partition checkpoints for rank 6 -loading 8 zero partition checkpoints for rank 197 -loading 8 zero partition checkpoints for rank 226 -loading 8 zero partition checkpoints for rank 244 -loading 8 zero partition checkpoints for rank 121 -loading 8 zero partition checkpoints for rank 167 -loading 8 zero partition checkpoints for rank 252 -loading 8 zero partition checkpoints for rank 123 -loading 8 zero partition checkpoints for rank 241 -loading 8 zero partition checkpoints for rank 4 -loading 8 zero partition checkpoints for rank 227 -loading 8 zero partition checkpoints for rank 47 -loading 8 zero partition checkpoints for rank 251 -loading 8 zero partition checkpoints for rank 17 -loading 8 zero partition checkpoints for rank 165 -loading 8 zero partition checkpoints for rank 242 -loading 8 zero partition checkpoints for rank 253 -loading 8 zero partition checkpoints for rank 224 -loading 8 zero partition checkpoints for rank 250 -loading 8 zero partition checkpoints for rank 231 -loading 8 zero partition checkpoints for rank 229 -loading 8 zero partition checkpoints for rank 245 -loading 8 zero partition checkpoints for rank 230 -loading 8 zero partition checkpoints for rank 228 
-loading 8 zero partition checkpoints for rank 255 -loading 8 zero partition checkpoints for rank 0 - checkpoint version 3.0 -loading 8 zero partition checkpoints for rank 234 -loading 8 zero partition checkpoints for rank 27 -loading 8 zero partition checkpoints for rank 233 -loading 8 zero partition checkpoints for rank 26 -loading 8 zero partition checkpoints for rank 225 -loading 8 zero partition checkpoints for rank 240 -loading 8 zero partition checkpoints for rank 30 -loading 8 zero partition checkpoints for rank 12 -loading 8 zero partition checkpoints for rank 16 -loading 8 zero partition checkpoints for rank 7 -loading 8 zero partition checkpoints for rank 5 -loading 8 zero partition checkpoints for rank 1 -loading 8 zero partition checkpoints for rank 31 -loading 8 zero partition checkpoints for rank 24 -loading 8 zero partition checkpoints for rank 246 -loading 8 zero partition checkpoints for rank 254 -loading 8 zero partition checkpoints for rank 19 -loading 8 zero partition checkpoints for rank 15 -loading 8 zero partition checkpoints for rank 14 -loading 8 zero partition checkpoints for rank 248 -loading 8 zero partition checkpoints for rank 243 -loading 8 zero partition checkpoints for rank 247 -loading 8 zero partition checkpoints for rank 9 -loading 8 zero partition checkpoints for rank 25 -loading 8 zero partition checkpoints for rank 232 -loading 8 zero partition checkpoints for rank 249 -loading 8 zero partition checkpoints for rank 11 -loading 8 zero partition checkpoints for rank 235 -loading 8 zero partition checkpoints for rank 236 -loading 8 zero partition checkpoints for rank 13 -loading 8 zero partition checkpoints for rank 237 -loading 8 zero partition checkpoints for rank 10 -loading 8 zero partition checkpoints for rank 239 -loading 8 zero partition checkpoints for rank 238 -loading 8 zero partition checkpoints for rank 8 - successfully loaded checkpoint from /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints at iteration 6210 -time (ms) | load-checkpoint: 56346.23 -[after model, optimizer, and learning rate scheduler are built] datetime: 2021-09-30 03:53:50 -> building train, validation, and test datasets ... - > datasets target sizes (minimum size): - train: 300000000 - validation: 1638400 - test: 10240 -> building train, validation, and test datasets for GPT ... - > building dataset index ... - reading sizes... - reading pointers... - reading document index... - creating numpy buffer of mmap... - creating memory view of numpy buffer... 
- > finished creating indexed dataset in 0.158410 seconds - number of documents: 304230423 - > dataset split: - train: - document indices in [0, 288714672) total of 288714672 documents - validation: - document indices in [288714672, 303926193) total of 15211521 documents - test: - document indices in [303926193, 304230423) total of 304230 documents - > loading doc-idx mapping from /gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document_train_indexmap_300000000ns_2048sl_43s_doc_idx.npy - > loading sample-idx mapping from /gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document_train_indexmap_300000000ns_2048sl_43s_sample_idx.npy - > loading shuffle-idx mapping from /gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document_train_indexmap_300000000ns_2048sl_43s_shuffle_idx.npy - loaded indexed file in 0.256 seconds - total number of samples: 394611670 - total number of epochs: 3 - > loading doc-idx mapping from /gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document_valid_indexmap_1638400ns_2048sl_43s_doc_idx.npy - > loading sample-idx mapping from /gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document_valid_indexmap_1638400ns_2048sl_43s_sample_idx.npy - > loading shuffle-idx mapping from /gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document_valid_indexmap_1638400ns_2048sl_43s_shuffle_idx.npy - loaded indexed file in 0.235 seconds - total number of samples: 6927161 - total number of epochs: 1 - > loading doc-idx mapping from /gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document_test_indexmap_10240ns_2048sl_43s_doc_idx.npy - > loading sample-idx mapping from /gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document_test_indexmap_10240ns_2048sl_43s_sample_idx.npy - > loading shuffle-idx mapping from /gpfswork/rech/six/commun/datasets-custom/oscar-en/meg-gpt2_text_document_test_indexmap_10240ns_2048sl_43s_shuffle_idx.npy - loaded indexed file in 0.069 seconds - total number of samples: 137384 - total number of epochs: 1 -> finished creating GPT datasets ... -[after dataloaders are built] datetime: 2021-09-30 03:53:56 -done with setup ... -training ... 
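
The dataset numbers above are internally consistent. The split boundaries correspond to roughly 94.9% / 5.0% / 0.1% of the 304,230,423 documents (consistent with a Megatron --split of 949,50,1, though the exact flag value is an assumption), and the training target of 300M samples at 2048 tokens each amounts to ~614B tokens, drawn from 3 epochs of the train split. The dataloaders built here feed the batch configuration printed earlier: 1 micro-batch x 256 accumulation steps x 8 data-parallel replicas = 2048 sequences per step.

    docs = 304_230_423
    train_end, valid_end = 288_714_672, 303_926_193

    assert round(train_end / docs, 3) == 0.949                 # train share
    assert round((valid_end - train_end) / docs, 3) == 0.050   # validation share
    assert round((docs - valid_end) / docs, 3) == 0.001        # test share

    target_tokens = 300_000_000 * 2048   # 614_400_000_000 tokens to train on
    samples_per_epoch = 394_611_670 / 3  # ~131.5M train samples per epoch

    # Global batch identity from the DeepSpeed config above:
    assert 1 * 256 * 8 == 2048  # micro_batch * grad_accum * DP replicas
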
-time (ms) | model-and-optimizer-setup: 64448.88 | train/valid/test-data-iterators-setup: 5454.03
-[before the start of training step] datetime: 2021-09-30 03:53:56
-[2021-09-30 03:53:56,830] [INFO] [checkpointing.py:408:forward] Activation Checkpointing Information
-[2021-09-30 03:53:56,830] [INFO] [checkpointing.py:409:forward] ----Partition Activations False, CPU CHECKPOINTING False
-[2021-09-30 03:53:56,830] [INFO] [checkpointing.py:412:forward] ----contiguous Memory Checkpointing False with 32 total layers
-[2021-09-30 03:53:56,830] [INFO] [checkpointing.py:415:forward] ----Synchronization False
-[2021-09-30 03:53:56,830] [INFO] [checkpointing.py:416:forward] ----Profiling time in checkpointing False
-[Rank 225] (after 6220 iterations) memory (MB) | allocated: 7107.7119140625 | max allocated: 11917.68994140625 | reserved: 20752.0 | max reserved: 20752.0
-[Rank 0] (after 6220 iterations) memory (MB) | allocated: 6689.83056640625 | max allocated: 13931.01416015625 | reserved: 23310.0 | max reserved: 23310.0
-[... analogous per-rank memory reports omitted: across the logged ranks, allocated is 5861-7108 MB, max allocated 10754-13931 MB, reserved 18970-23374 MB ...]
- iteration 6220/ 159576 | consumed samples: 194400 | elapsed time per iteration (ms): 30069.8 | learning rate: 5.378E-05 | global batch size: 80 | lm loss: 6.355436E+00 | loss scale: 4096.0 | grad norm: 132438.701 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-[... further per-rank memory reports for ranks 32-195 omitted, same format and value ranges as above ...]
- iteration 6230/ 159576 | consumed samples: 195200 | elapsed time per iteration (ms): 29715.9 | learning rate: 5.400E-05 | global batch size: 80 | lm loss: 6.325600E+00 | loss scale: 4096.0 | grad norm: 93189.900 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6240/ 159576 | consumed samples: 196000 | elapsed time per iteration (ms): 29850.6 | learning rate: 5.423E-05 | global batch size: 80 | lm loss: 6.314528E+00 | loss scale: 4096.0 | grad norm: 153013.405 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
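The iteration lines pin down the global throughput: 80 samples of 2048 tokens (the `2048sl` in the index-map filenames above) every ~30 s. A back-of-the-envelope check, with the 256 GPUs implied by the rank numbers (an inference from this log, not a stated figure):

```python
# Rough throughput from the iteration-6220 stats above.
global_batch_size = 80      # "global batch size: 80"
seq_len = 2048              # "2048sl" in the index-map filenames
iter_time_s = 30.0698       # "elapsed time per iteration (ms): 30069.8"
n_gpus = 256                # assumed: rank ids in this log run from 0 to 255

samples_per_s = global_batch_size / iter_time_s
tokens_per_s = samples_per_s * seq_len
print(f"{samples_per_s:.2f} samples/s | {tokens_per_s:,.0f} tokens/s "
      f"| {tokens_per_s / n_gpus:.1f} tokens/s/GPU")
# 2.66 samples/s | 5,449 tokens/s | 21.3 tokens/s/GPU
```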
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 04:11:21 CEST)" was missed by 0:00:10.154311
-[... the same warning repeated by each of the several dozen tracker processes, and again for the 04:12:21, 04:13:21 and 04:14:21 runs, missed by roughly 4.5-10.2 s each time ...]
- iteration 6250/ 159576 | consumed samples: 196800 | elapsed time per iteration (ms): 29009.9 | learning rate: 5.445E-05 | global batch size: 80 | lm loss: 6.303601E+00 | loss scale: 4096.0 | grad norm: 137433.627 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6260/ 159576 | consumed samples: 197600 | elapsed time per iteration (ms): 28924.6 | learning rate: 5.467E-05 | global batch size: 80 | lm loss: 6.323338E+00 | loss scale: 4096.0 | grad norm: 108774.777 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6270/ 159576 | consumed samples: 198400 | elapsed time per iteration (ms): 29624.6 | learning rate: 5.489E-05 | global batch size: 80 | lm loss: 6.321053E+00 | loss scale: 4096.0 | grad norm: 100365.177 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6280/ 159576 | consumed samples: 199200 | elapsed time per iteration (ms): 29739.7 | learning rate: 5.511E-05 | global batch size: 80 | lm loss: 6.322646E+00 | loss scale: 4096.0 | grad norm: 175808.123 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
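The WARNING bursts come from codecarbon's emissions tracker: `BaseEmissionsTracker` schedules its `_measure_power` job on an APScheduler one-minute interval trigger (the `interval[0:01:00]` in the messages), and APScheduler logs a miss whenever a job starts later than its misfire grace time, as happens here when the training step saturates the node. A minimal sketch of that scheduling pattern (the job body and grace value are illustrative, not codecarbon's actual settings):

```python
import time
from apscheduler.schedulers.background import BackgroundScheduler

def measure_power():
    """Stand-in for BaseEmissionsTracker._measure_power (samples power draw)."""

scheduler = BackgroundScheduler()
# A run starting more than misfire_grace_time seconds late is reported as
# 'Run time of job ... was missed by ...', exactly the warnings above.
scheduler.add_job(measure_power, "interval", minutes=1, misfire_grace_time=5)
scheduler.start()
time.sleep(300)            # the training process keeps running meanwhile
scheduler.shutdown()
```

The misses here are 4-11 s against a 60 s interval, so at worst a power sample lands late; they are noise as far as the training run itself is concerned.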
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 04:32:21 CEST)" was missed by 0:00:10.560016
-[... repeated across the tracker processes for the 04:32:21 run (missed by ~10.6-10.8 s) and the 04:33:21 run (missed by ~8.2-8.7 s) ...]
- iteration 6290/ 159576 | consumed samples: 200000 | elapsed time per iteration (ms): 28954.3 | learning rate: 5.534E-05 | global batch size: 80 | lm loss: 6.273073E+00 | loss scale: 4096.0 | grad norm: 97252.912 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 04:34:21 CEST)" was missed by 0:00:06.288614
-[... and again from each tracker process for the 04:34:21 run (missed by ~6.0-6.5 s) and the 04:35:21 run (missed by ~4.2-4.7 s) ...]
- iteration 6300/ 159576 | consumed samples: 200800 | elapsed time per iteration (ms): 29277.3 | learning rate: 5.556E-05 | global batch size: 80 | lm loss: 6.295372E+00 | loss scale: 4096.0 | grad norm: 153910.916 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-saving checkpoint at iteration 6300 to /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints
-[2021-09-30 04:38:25,123] [INFO] [logging.py:68:log_dist] [Rank 0] Saving model checkpoint: /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/global_step6300/mp_rank_00_model_states.pt
- successfully saved checkpoint at iteration 6300 to /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints
-time (ms) | save-checkpoint: 18667.92
-[2021-09-30 04:42:10] PULSE: tr8-104B is running for 50:03 since 2021-09-30T03:52:07 (1289770 on 'gpu_p13' partition (r6i4n[5-6,8],r6i5n[4-5],r7i0n[5-8],r7i1n0,r8i2n8,r8i4n1,r8i7n[3-8],r9i0n[0-8],r9i1n[0-8],r9i2n[3-8],r9i3n[7-8],r9i4n[0-2],r9i5n[2,5-7],r9i6n[2-8],r14i7n[1-6])
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 04:43:21 CEST)" was missed by 0:00:10.963852
- iteration 6310/ 159576 | consumed samples: 201600 | elapsed time per iteration (ms): 31055.8 | learning rate: 5.578E-05 | global batch size: 80 | lm loss: 6.324059E+00 | loss scale: 4096.0 | grad norm: 124591.778 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 04:44:21 CEST)" was missed by 0:00:09.738472
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 04:45:21 CEST)" was missed by 0:00:07.645084
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 04:46:21 CEST)" was missed by 0:00:06.041857
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 04:47:21 CEST)" was missed by 0:00:03.102876
- iteration 6320/ 159576 | consumed samples: 202400 | elapsed time per iteration (ms): 28833.6 | learning rate: 5.600E-05 | global batch size: 80 | lm loss: 6.299813E+00 | loss scale: 4096.0 | grad norm: 122818.261 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6330/ 159576 | consumed samples: 203200 | elapsed time per iteration (ms): 29174.3 | learning rate: 5.622E-05 | global batch size: 80 | lm loss: 6.322478E+00 | loss scale: 4096.0 | grad norm: 120418.031 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 04:58:21 CEST)" was missed by 0:00:11.253093
- iteration 6340/ 159576 | consumed samples: 204000 | elapsed time per iteration (ms): 29051.4 | learning rate: 5.645E-05 | global batch size: 80 | lm loss: 6.316248E+00 | loss scale: 4096.0 | grad norm: 133284.538 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6350/ 159576 | consumed samples: 204800 | elapsed time per iteration (ms): 27117.2 | learning rate: 5.664E-05 | global batch size: 80 | lm loss: 6.308830E+00 | loss scale: 4096.0 | grad norm: 104470.093 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:03:21 CEST)" was missed by 0:00:10.858272
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:04:21 CEST)" was missed by 0:00:09.544306
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:05:21 CEST)" was missed by 0:00:06.357490
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 
2021-09-30 05:05:21 CEST)" was missed by 0:00:06.464731 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:05:21 CEST)" was missed by 0:00:06.416080 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:05:21 CEST)" was missed by 0:00:06.156737 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:05:21 CEST)" was missed by 0:00:06.004174 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:05:21 CEST)" was missed by 0:00:06.118245 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:05:21 CEST)" was missed by 0:00:06.317759 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:05:21 CEST)" was missed by 0:00:06.452289 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:05:21 CEST)" was missed by 0:00:06.310866 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:05:21 CEST)" was missed by 0:00:06.289131 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:05:21 CEST)" was missed by 0:00:06.278334 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:05:21 CEST)" was missed by 0:00:06.381042 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:05:21 CEST)" was missed by 0:00:06.364036 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:05:21 CEST)" was missed by 0:00:06.260088 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:05:21 CEST)" was missed by 0:00:06.379750 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:05:21 CEST)" was missed by 0:00:06.406383 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:05:21 CEST)" was missed by 0:00:06.150297 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:05:21 CEST)" was missed by 0:00:06.256564 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:05:21 CEST)" was missed by 0:00:06.439936 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:05:21 CEST)" was missed by 0:00:06.153747 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:05:21 CEST)" was missed by 0:00:06.177686 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:05:21 CEST)" was missed by 0:00:06.196389 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:05:21 CEST)" was missed by 0:00:06.509855 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:05:21 CEST)" was missed by 0:00:06.467049 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:05:21 CEST)" was missed by 0:00:06.346837 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:05:21 CEST)" was missed by 0:00:06.197555 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:05:21 CEST)" was missed by 0:00:06.277451 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:05:21 CEST)" was missed by 0:00:06.465213 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:05:21 CEST)" was missed by 0:00:06.189563 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:05:21 CEST)" was missed by 0:00:06.164154 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:05:21 CEST)" was missed by 0:00:06.189452 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:05:21 CEST)" was missed by 0:00:06.446079 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:05:21 CEST)" was missed by 0:00:06.308727 - iteration 6360/ 159576 | consumed samples: 205600 | elapsed time per iteration (ms): 28711.8 | learning rate: 5.687E-05 | global batch size: 80 | lm loss: 6.295247E+00 | loss scale: 4096.0 | grad norm: 141519.847 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 6370/ 159576 | consumed samples: 206400 | elapsed time per iteration (ms): 28608.5 | learning rate: 5.709E-05 | global batch size: 80 | lm loss: 6.339661E+00 | loss scale: 4096.0 | grad norm: 73871.381 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 6380/ 159576 | consumed samples: 207200 | elapsed time per iteration (ms): 26950.3 | learning rate: 5.729E-05 | global batch size: 80 | lm loss: 6.321135E+00 | loss scale: 2048.0 | grad norm: 41452.211 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:18:21 CEST)" was missed by 0:00:10.127092 
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:18:21 CEST)" was missed by 0:00:10.127092
[... further near-identical warnings for the 05:18:21, 05:19:21, 05:20:21 and 05:21:21 CEST runs elided; the reported delay decays from ~10 s to ~3 s over those minutes as the trackers catch up ...]
- iteration 6390/ 159576 | consumed samples: 208000 | elapsed time per iteration (ms): 28738.4 | learning rate: 5.751E-05 | global batch size: 80 | lm loss: 6.297319E+00 | loss scale: 2048.0 | grad norm: 49955.717 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6400/ 159576 | consumed samples: 208800 | elapsed time per iteration (ms): 28798.3 | learning rate: 5.773E-05 | global batch size: 80 | lm loss: 6.308155E+00 | loss scale: 2048.0 | grad norm: 31443.708 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
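The iteration records are the useful signal buried in this log. The following sketch pulls them into structured tuples for plotting or tabulation; the regex and field names are chosen here to match the format above and are not part of any project tooling.

import re
from typing import Iterable, Iterator, NamedTuple

class Step(NamedTuple):
    iteration: int
    consumed_samples: int
    learning_rate: float
    lm_loss: float
    loss_scale: float
    grad_norm: float

# Matches records like:
#   iteration 6400/ 159576 | consumed samples: 208800 | ... |
#   learning rate: 5.773E-05 | ... | lm loss: 6.308155E+00 |
#   loss scale: 2048.0 | grad norm: 31443.708 | ...
PATTERN = re.compile(
    r"iteration\s+(\d+)/\s*\d+\s*\|"
    r".*?consumed samples:\s*(\d+)\s*\|"
    r".*?learning rate:\s*([\d.E+-]+)\s*\|"
    r".*?lm loss:\s*([\d.E+-]+)\s*\|"
    r".*?loss scale:\s*([\d.]+)\s*\|"
    r".*?grad norm:\s*([\d.]+)"
)

def parse_iterations(lines: Iterable[str]) -> Iterator[Step]:
    # Yields one Step per training-progress record, skipping warnings etc.
    for line in lines:
        m = PATTERN.search(line)
        if m:
            yield Step(int(m.group(1)), int(m.group(2)), float(m.group(3)),
                       float(m.group(4)), float(m.group(5)), float(m.group(6)))

record = (" iteration 6400/ 159576 | consumed samples: 208800 | "
          "elapsed time per iteration (ms): 28798.3 | learning rate: 5.773E-05 | "
          "global batch size: 80 | lm loss: 6.308155E+00 | loss scale: 2048.0 | "
          "grad norm: 31443.708 | num zeros: 0.0 |")
print(next(parse_iterations([record])))
# Step(iteration=6400, consumed_samples=208800, learning_rate=5.773e-05,
#      lm_loss=6.308155, loss_scale=2048.0, grad_norm=31443.708)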
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:31:21 CEST)" was missed by 0:00:10.866637
[... further near-identical warnings for the 05:31:21 CEST run elided (missed by ~10.6-11.1 s) ...]
- iteration 6410/ 159576 | consumed samples: 209600 | elapsed time per iteration (ms): 29052.9 | learning rate: 5.795E-05 | global batch size: 80 | lm loss: 6.333957E+00 | loss scale: 2048.0 | grad norm: 36283.153 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:32:21 CEST)" was missed by 0:00:07.902182
[... further near-identical warnings for the 05:32:21 (missed by ~7.6-8.1 s) and 05:33:21 (missed by ~6.0-6.5 s) CEST runs elided ...]
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:34:21 CEST)" was missed by 0:00:03.156218
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:34:21 CEST)" was missed by 0:00:03.036014 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:34:21 CEST)" was missed by 0:00:03.164463 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:34:21 CEST)" was missed by 0:00:03.193527 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:34:21 CEST)" was missed by 0:00:03.292692 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:34:21 CEST)" was missed by 0:00:03.155377 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:34:21 CEST)" was missed by 0:00:03.210747 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:34:21 CEST)" was missed by 0:00:03.000434 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:34:21 CEST)" was missed by 0:00:03.024387 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:34:21 CEST)" was missed by 0:00:03.043073 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:34:21 CEST)" was missed by 0:00:03.313742 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:34:21 CEST)" was missed by 0:00:03.124127 - iteration 6420/ 159576 | consumed samples: 210400 | elapsed time per iteration (ms): 28687.3 | learning rate: 5.818E-05 | global batch size: 80 | lm loss: 6.311902E+00 | loss scale: 2048.0 | grad norm: 48812.385 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 6430/ 159576 | consumed samples: 211200 | elapsed time per iteration (ms): 28644.5 | learning rate: 5.840E-05 | global batch size: 80 | lm loss: 6.339233E+00 | loss scale: 2048.0 | grad norm: 73811.722 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:42:21 CEST)" was missed by 0:00:09.672342 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:42:21 CEST)" was missed by 0:00:09.623708 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:42:21 CEST)" was missed by 0:00:09.435576 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:42:21 CEST)" was missed by 0:00:09.517243 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:42:21 CEST)" was missed by 0:00:09.467731 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:42:21 CEST)" was missed by 0:00:09.565152 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:42:21 CEST)" was missed by 0:00:09.675177 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:42:21 CEST)" was missed by 0:00:09.211888 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:42:21 CEST)" was missed by 0:00:09.659972 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:42:21 CEST)" was missed by 0:00:09.685664 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:42:21 CEST)" was missed by 0:00:09.464197 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:42:21 CEST)" was missed by 0:00:09.717515 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:42:21 CEST)" was missed by 0:00:09.587408 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:42:21 CEST)" was missed by 0:00:09.622460 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:42:21 CEST)" was missed by 0:00:09.672873 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:42:21 CEST)" was missed by 0:00:09.325988 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:42:21 CEST)" was missed by 0:00:09.404075 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:42:21 CEST)" was missed by 0:00:09.718366 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:42:21 CEST)" was missed by 0:00:09.518581 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:42:21 CEST)" was missed by 0:00:09.424191 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:42:21 CEST)" was missed by 0:00:09.486052 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:42:21 CEST)" was missed by 0:00:09.614086 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:42:21 CEST)" was missed by 0:00:09.405179 
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:42:21 CEST)" was missed by 0:00:09.485113 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:42:21 CEST)" was missed by 0:00:09.397237 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:42:21 CEST)" was missed by 0:00:09.361446 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:42:21 CEST)" was missed by 0:00:09.397077 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:42:21 CEST)" was missed by 0:00:09.525503 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:42:21 CEST)" was missed by 0:00:09.496853 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:42:21 CEST)" was missed by 0:00:09.653741 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:42:21 CEST)" was missed by 0:00:09.357965 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:42:21 CEST)" was missed by 0:00:09.364440 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:42:21 CEST)" was missed by 0:00:09.647656 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:42:21 CEST)" was missed by 0:00:09.554588 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:42:21 CEST)" was missed by 0:00:09.588789 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:42:21 CEST)" was missed by 0:00:09.571800 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:42:21 CEST)" was missed by 0:00:09.371846 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:42:21 CEST)" was missed by 0:00:09.516442 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:42:21 CEST)" was missed by 0:00:09.385503 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:42:21 CEST)" was missed by 0:00:09.674794 -[2021-09-30 05:42:11] PULSE: tr8-104B is running for 1:50:04 since 2021-09-30T03:52:07 (1289770 on 'gpu_p13' partition (r6i4n[5-6,8],r6i5n[4-5],r7i0n[5-8],r7i1n0,r8i2n8,r8i4n1,r8i7n[3-8],r9i0n[0-8],r9i1n[0-8],r9i2n[3-8],r9i3n[7-8],r9i4n[0-2],r9i5n[2,5-7],r9i6n[2-8],r14i7n[1-6]) -WARNING:apscheduler.executors.default:Run 
time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:43:21 CEST)" was missed by 0:00:08.180463 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:43:21 CEST)" was missed by 0:00:08.223619 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:43:21 CEST)" was missed by 0:00:08.022614 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:43:21 CEST)" was missed by 0:00:07.929490 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:43:21 CEST)" was missed by 0:00:08.177728 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:43:21 CEST)" was missed by 0:00:08.092656 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:43:21 CEST)" was missed by 0:00:08.190934 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:43:21 CEST)" was missed by 0:00:08.070528 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:43:21 CEST)" was missed by 0:00:08.129060 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:43:21 CEST)" was missed by 0:00:07.940945 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:43:21 CEST)" was missed by 0:00:07.909354 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:43:21 CEST)" was missed by 0:00:08.030754 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:43:21 CEST)" was missed by 0:00:08.165341 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:43:21 CEST)" was missed by 0:00:08.023859 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:43:21 CEST)" was missed by 0:00:08.002113 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:43:21 CEST)" was missed by 0:00:07.973093 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:43:21 CEST)" was missed by 0:00:08.119370 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:43:21 CEST)" was missed by 0:00:07.863236 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:43:21 CEST)" was missed by 
0:00:07.910504 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:43:21 CEST)" was missed by 0:00:07.869706 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:43:21 CEST)" was missed by 0:00:07.990408 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:43:21 CEST)" was missed by 0:00:07.717225 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:43:21 CEST)" was missed by 0:00:07.902538 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:43:21 CEST)" was missed by 0:00:07.831310 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:43:21 CEST)" was missed by 0:00:08.222856 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:43:21 CEST)" was missed by 0:00:08.059851 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:43:21 CEST)" was missed by 0:00:07.991364 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:43:21 CEST)" was missed by 0:00:08.127787 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:43:21 CEST)" was missed by 0:00:07.969580 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:43:21 CEST)" was missed by 0:00:08.152986 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:43:21 CEST)" was missed by 0:00:08.178219 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:43:21 CEST)" was missed by 0:00:07.866781 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:43:21 CEST)" was missed by 0:00:07.902411 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:43:21 CEST)" was missed by 0:00:08.180104 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:43:21 CEST)" was missed by 0:00:08.094123 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:43:21 CEST)" was missed by 0:00:08.159064 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:43:21 CEST)" was missed by 0:00:07.877185 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: 
interval[0:01:00], next run at: 2021-09-30 05:43:21 CEST)" was missed by 0:00:08.077150 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:43:21 CEST)" was missed by 0:00:07.890794 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:43:21 CEST)" was missed by 0:00:08.021810 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:44:21 CEST)" was missed by 0:00:08.229227 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:44:21 CEST)" was missed by 0:00:08.280684 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:44:21 CEST)" was missed by 0:00:08.041113 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:44:21 CEST)" was missed by 0:00:08.122834 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:44:21 CEST)" was missed by 0:00:08.277886 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:44:21 CEST)" was missed by 0:00:08.291169 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:44:21 CEST)" was missed by 0:00:08.170674 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:44:21 CEST)" was missed by 0:00:07.963441 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:44:21 CEST)" was missed by 0:00:08.278344 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:44:21 CEST)" was missed by 0:00:07.817381 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:44:21 CEST)" was missed by 0:00:08.323869 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:44:21 CEST)" was missed by 0:00:08.102340 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:44:21 CEST)" was missed by 0:00:08.073263 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:44:21 CEST)" was missed by 0:00:08.192912 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:44:21 CEST)" was missed by 0:00:08.219576 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:44:21 CEST)" was missed by 0:00:08.227965 -WARNING:apscheduler.executors.default:Run 
time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:44:21 CEST)" was missed by 0:00:07.969932 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:44:21 CEST)" was missed by 0:00:08.069726 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:44:21 CEST)" was missed by 0:00:07.966936 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:44:21 CEST)" was missed by 0:00:08.002578 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:44:21 CEST)" was missed by 0:00:08.009560 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:44:21 CEST)" was missed by 0:00:08.130984 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:44:21 CEST)" was missed by 0:00:08.265562 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:44:21 CEST)" was missed by 0:00:08.124098 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:44:21 CEST)" was missed by 0:00:08.160072 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:44:21 CEST)" was missed by 0:00:08.029742 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:44:21 CEST)" was missed by 0:00:08.259255 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:44:21 CEST)" was missed by 0:00:08.010692 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:44:21 CEST)" was missed by 0:00:08.090648 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:44:21 CEST)" was missed by 0:00:08.002775 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:44:21 CEST)" was missed by 0:00:07.977333 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:44:21 CEST)" was missed by 0:00:07.931537 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:44:21 CEST)" was missed by 0:00:08.323081 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:44:21 CEST)" was missed by 0:00:08.091595 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:44:21 CEST)" was missed by 
0:00:08.194312 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:44:21 CEST)" was missed by 0:00:08.177336 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:44:21 CEST)" was missed by 0:00:08.253227 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:44:21 CEST)" was missed by 0:00:08.280309 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:44:21 CEST)" was missed by 0:00:07.990999 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:44:21 CEST)" was missed by 0:00:08.122000 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:45:21 CEST)" was missed by 0:00:07.177887 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:45:21 CEST)" was missed by 0:00:06.976881 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:45:21 CEST)" was missed by 0:00:07.083312 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:45:21 CEST)" was missed by 0:00:07.134791 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:45:21 CEST)" was missed by 0:00:06.978141 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:45:21 CEST)" was missed by 0:00:06.883753 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:45:21 CEST)" was missed by 0:00:07.132026 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:45:21 CEST)" was missed by 0:00:07.046995 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:45:21 CEST)" was missed by 0:00:07.145235 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:45:21 CEST)" was missed by 0:00:06.856786 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:45:21 CEST)" was missed by 0:00:06.895252 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:45:21 CEST)" was missed by 0:00:07.119624 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:45:21 CEST)" was missed by 0:00:07.014141 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: 
interval[0:01:00], next run at: 2021-09-30 05:45:21 CEST)" was missed by 0:00:06.956423 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:45:21 CEST)" was missed by 0:00:07.073656 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:45:21 CEST)" was missed by 0:00:07.024838 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:45:21 CEST)" was missed by 0:00:06.864778 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:45:21 CEST)" was missed by 0:00:06.824002 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:45:21 CEST)" was missed by 0:00:06.944703 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:45:21 CEST)" was missed by 0:00:06.671503 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:45:21 CEST)" was missed by 0:00:06.785593 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:45:21 CEST)" was missed by 0:00:06.863666 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:45:21 CEST)" was missed by 0:00:06.985082 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:45:21 CEST)" was missed by 0:00:06.945648 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:45:21 CEST)" was missed by 0:00:06.927420 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:45:21 CEST)" was missed by 0:00:07.082100 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:45:21 CEST)" was missed by 0:00:06.817542 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:45:21 CEST)" was missed by 0:00:06.923871 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:45:21 CEST)" was missed by 0:00:07.132473 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:45:21 CEST)" was missed by 0:00:06.821051 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:45:21 CEST)" was missed by 0:00:07.177184 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:45:21 CEST)" was missed by 0:00:07.113364 -WARNING:apscheduler.executors.default:Run 
time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:45:21 CEST)" was missed by 0:00:07.107290 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:45:21 CEST)" was missed by 0:00:06.831454 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:45:21 CEST)" was missed by 0:00:06.856711 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:45:21 CEST)" was missed by 0:00:07.048442 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:45:21 CEST)" was missed by 0:00:07.031459 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:45:21 CEST)" was missed by 0:00:07.134413 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:45:21 CEST)" was missed by 0:00:06.976133 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:45:21 CEST)" was missed by 0:00:06.845142 - iteration 6440/ 159576 | consumed samples: 212000 | elapsed time per iteration (ms): 29237.2 | learning rate: 5.862E-05 | global batch size: 80 | lm loss: 6.297226E+00 | loss scale: 2048.0 | grad norm: 57023.083 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:46:21 CEST)" was missed by 0:00:05.821089 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:46:21 CEST)" was missed by 0:00:05.620113 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:46:21 CEST)" was missed by 0:00:05.526937 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:46:21 CEST)" was missed by 0:00:05.668000 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:46:21 CEST)" was missed by 0:00:05.726551 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:46:21 CEST)" was missed by 0:00:05.777991 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:46:21 CEST)" was missed by 0:00:05.499986 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:46:21 CEST)" was missed by 0:00:05.538444 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:46:21 CEST)" was missed by 0:00:05.762838 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:46:21 CEST)" was missed by 0:00:05.621338 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:46:21 CEST)" was missed by 0:00:05.599598 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:46:21 CEST)" was missed by 0:00:05.691526 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:46:21 CEST)" was missed by 0:00:05.775209 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:46:21 CEST)" was missed by 0:00:05.690197 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:46:21 CEST)" was missed by 0:00:05.716868 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:46:21 CEST)" was missed by 0:00:05.788454 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:46:21 CEST)" was missed by 0:00:05.725208 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:46:21 CEST)" was missed by 0:00:05.460744 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:46:21 CEST)" was missed by 0:00:05.507986 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:46:21 CEST)" was missed by 0:00:05.775660 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:46:21 CEST)" was missed by 0:00:05.314703 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:46:21 CEST)" was missed by 0:00:05.464247 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:46:21 CEST)" was missed by 0:00:05.506861 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:46:21 CEST)" was missed by 0:00:05.820373 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:46:21 CEST)" was missed by 0:00:05.628281 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:46:21 CEST)" was missed by 0:00:05.657344 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:46:21 CEST)" was missed by 0:00:05.588848 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:46:21 CEST)" was missed by 0:00:05.756563 
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:46:21 CEST)" was missed by 0:00:05.674581 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:46:21 CEST)" was missed by 0:00:05.570606 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:46:21 CEST)" was missed by 0:00:05.467226 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:46:21 CEST)" was missed by 0:00:05.567060 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:46:21 CEST)" was missed by 0:00:05.587930 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:46:21 CEST)" was missed by 0:00:05.750513 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:46:21 CEST)" was missed by 0:00:05.474643 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:46:21 CEST)" was missed by 0:00:05.428814 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:46:21 CEST)" was missed by 0:00:05.488222 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:46:21 CEST)" was missed by 0:00:05.499905 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:46:21 CEST)" was missed by 0:00:05.777603 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:46:21 CEST)" was missed by 0:00:05.619228 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:47:21 CEST)" was missed by 0:00:05.443264 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:47:21 CEST)" was missed by 0:00:05.537854 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:47:21 CEST)" was missed by 0:00:05.243717 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:47:21 CEST)" was missed by 0:00:05.505190 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:47:21 CEST)" was missed by 0:00:05.494780 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:47:21 CEST)" was missed by 0:00:05.216720 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 
2021-09-30 05:47:21 CEST)" was missed by 0:00:05.255198 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:47:21 CEST)" was missed by 0:00:05.338103 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:47:21 CEST)" was missed by 0:00:05.336886 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:47:21 CEST)" was missed by 0:00:05.491985 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:47:21 CEST)" was missed by 0:00:05.406954 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:47:21 CEST)" was missed by 0:00:05.433607 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:47:21 CEST)" was missed by 0:00:05.384789 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:47:21 CEST)" was missed by 0:00:05.177502 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:47:21 CEST)" was missed by 0:00:05.374091 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:47:21 CEST)" was missed by 0:00:05.316395 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:47:21 CEST)" was missed by 0:00:05.224720 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:47:21 CEST)" was missed by 0:00:05.183976 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:47:21 CEST)" was missed by 0:00:05.304654 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:47:21 CEST)" was missed by 0:00:05.031474 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:47:21 CEST)" was missed by 0:00:05.223627 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:47:21 CEST)" was missed by 0:00:05.479611 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:47:21 CEST)" was missed by 0:00:05.305601 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:47:21 CEST)" was missed by 0:00:05.442032 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:47:21 CEST)" was missed by 0:00:05.283835 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:47:21 CEST)" was missed by 0:00:05.492456 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:47:21 CEST)" was missed by 0:00:05.145578 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:47:21 CEST)" was missed by 0:00:05.181027 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:47:21 CEST)" was missed by 0:00:05.216640 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:47:21 CEST)" was missed by 0:00:05.537148 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:47:21 CEST)" was missed by 0:00:05.345061 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:47:21 CEST)" was missed by 0:00:05.408358 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:47:21 CEST)" was missed by 0:00:05.473311 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:47:21 CEST)" was missed by 0:00:05.391385 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:47:21 CEST)" was missed by 0:00:05.287398 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:47:21 CEST)" was missed by 0:00:05.467266 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:47:21 CEST)" was missed by 0:00:05.191442 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:47:21 CEST)" was missed by 0:00:05.494383 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:47:21 CEST)" was missed by 0:00:05.205054 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:47:21 CEST)" was missed by 0:00:05.336036 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:48:21 CEST)" was missed by 0:00:05.943631 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:48:21 CEST)" was missed by 0:00:05.894989 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:48:21 CEST)" was missed by 0:00:05.946453 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:48:21 CEST)" was missed by 0:00:05.706859 
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:48:21 CEST)" was missed by 0:00:05.989577
-[... same warning repeated ~35 more times for this tick, delays 0:00:05.48-0:00:05.99 ...]
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 05:49:21 CEST)" was missed by 0:00:04.416283
-[... same warning repeated ~39 more times for this tick, delays 0:00:03.91-0:00:04.42 ...]
- iteration 6450/ 159576 | consumed samples: 212800 | elapsed time per iteration (ms): 29595.0 | learning rate: 5.884E-05 | global batch size: 80 | lm loss: 6.299403E+00 | loss scale: 2048.0 | grad norm: 65910.989 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
- iteration 6460/ 159576 | consumed samples: 213600 | elapsed time per iteration (ms): 29891.0 | learning rate: 5.906E-05 | global batch size: 80 | lm loss: 6.318707E+00 | loss scale: 2048.0 | grad norm: 76118.048 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
- iteration 6470/ 159576 | consumed samples: 214400 | elapsed time per iteration (ms): 29671.9 | learning rate: 5.929E-05 | global batch size: 80 | lm loss: 6.299670E+00 | loss scale: 2048.0 | grad norm: 59518.850 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
- iteration 6480/ 159576 | consumed samples: 215200 | elapsed time per iteration (ms): 29322.0 | learning rate: 5.951E-05 | global batch size: 80 | lm loss: 6.325890E+00 | loss scale: 2048.0 | grad norm: 50644.623 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
- iteration 6490/ 159576 | consumed samples: 216000 | elapsed time per iteration (ms): 30024.9 | learning rate: 5.973E-05 | global batch size: 80 | lm loss: 6.311376E+00 | loss scale: 2048.0 | grad norm: 71729.082 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
- iteration 6500/ 159576 | consumed samples: 216800 | elapsed time per iteration (ms): 30086.9 | learning rate: 5.995E-05 | global batch size: 80 | lm loss: 6.319954E+00 | loss scale: 2048.0 | grad norm: 50618.616 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
- iteration 6510/ 159576 | consumed samples: 217600 | elapsed time per iteration (ms): 29295.7 | learning rate: 6.000E-05 | global batch size: 80 | lm loss: 6.319446E+00 | loss scale: 2048.0 | grad norm: 59473.520 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
- iteration 6520/ 159576 | consumed samples: 218400 | elapsed time per iteration (ms): 28023.0 | learning rate: 6.000E-05 | global batch size: 80 | lm loss: 6.312102E+00 | loss scale: 1024.0 | grad norm: 43424.343 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
- iteration 6530/ 159576 | consumed samples: 219200 | elapsed time per iteration (ms): 30083.9 | learning rate: 6.000E-05 | global batch size: 80 | lm loss: 6.307513E+00 | loss scale: 1024.0 | grad norm: 64246.025 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
- iteration 6540/ 159576 | consumed samples: 220000 | elapsed time per iteration (ms): 30333.9 | learning rate: 6.000E-05 | global batch size: 80 | lm loss: 6.300389E+00 | loss scale: 1024.0 | grad norm: 34309.791 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
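The iteration records above follow Megatron-LM's fixed pipe-separated format, so the metrics buried in this log (lm loss, loss scale, grad norm, ...) can be pulled out mechanically. A minimal sketch of such a scraper, assuming Python 3.10+; the function name and regex are ours for illustration, not part of the training code:

    import re

    # Field names mirror the log lines above; fields we don't need are
    # skipped over with non-greedy gaps.
    ITER_RE = re.compile(
        r"iteration\s+(?P<iteration>\d+)/\s*(?P<total>\d+)"
        r".*?consumed samples:\s*(?P<samples>\d+)"
        r".*?elapsed time per iteration \(ms\):\s*(?P<ms>[\d.]+)"
        r".*?learning rate:\s*(?P<lr>[\d.E+-]+)"
        r".*?lm loss:\s*(?P<loss>[\d.E+-]+)"
        r".*?loss scale:\s*(?P<scale>[\d.]+)"
        r".*?grad norm:\s*(?P<gnorm>[\d.]+)"
    )

    def parse_iteration_line(line: str) -> dict | None:
        """Return the numeric fields of one iteration record, or None."""
        m = ITER_RE.search(line)
        return {k: float(v) for k, v in m.groupdict().items()} if m else None

    line = ("iteration 6450/ 159576 | consumed samples: 212800 | "
            "elapsed time per iteration (ms): 29595.0 | learning rate: 5.884E-05 | "
            "global batch size: 80 | lm loss: 6.299403E+00 | loss scale: 2048.0 | "
            "grad norm: 65910.989 | num zeros: 0.0 |")
    print(parse_iteration_line(line))  # {'iteration': 6450.0, ..., 'loss': 6.299403, ...}

Fed line by line over the whole log, this yields a loss curve without any external tooling.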
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:39:21 CEST)" was missed by 0:00:03.610783
-[... same warning repeated ~39 more times for this tick, delays 0:00:03.10-0:00:03.61 ...]
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:40:21 CEST)" was missed by 0:00:05.095482
-[... same warning repeated ~39 more times for this tick, delays 0:00:04.65-0:00:05.15 ...]
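These warning floods come from APScheduler's misfire detection: CodeCarbon schedules BaseEmissionsTracker._measure_power as a one-minute interval job, and when the scheduler thread is starved by the training processes the job starts late, so the apscheduler.executors.default executor logs how late it was. Within the 06:39-06:46 stretch of this log the lateness grows every minute (from roughly 3.6 s to roughly 8.3 s, see below), i.e. the power-sampling thread keeps falling further behind. Two standard knobs, sketched under the assumption that one owns either the logging config or the scheduling call (CodeCarbon builds its own scheduler internally, so the add_job line is illustrative):

    import logging
    from apscheduler.schedulers.background import BackgroundScheduler

    # Option 1: drop these specific warnings from the log.
    logging.getLogger("apscheduler.executors.default").setLevel(logging.ERROR)

    # Option 2: tolerate late starts instead of warning about them, by
    # giving the interval job a generous misfire grace period.
    def measure_power():          # stand-in for the real measurement callback
        pass

    scheduler = BackgroundScheduler()
    scheduler.add_job(measure_power, "interval", minutes=1, misfire_grace_time=30)
    scheduler.start()

Neither option changes training behaviour; the warnings concern only the power-measurement side channel.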
- iteration 6550/ 159576 | consumed samples: 220800 | elapsed time per iteration (ms): 30501.9 | learning rate: 6.000E-05 | global batch size: 80 | lm loss: 6.314403E+00 | loss scale: 1024.0 | grad norm: 30470.176 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:41:21 CEST)" was missed by 0:00:06.386905
-[... same warning repeated ~39 more times for this tick, delays 0:00:05.88-0:00:06.39 ...]
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:42:21 CEST)" was missed by 0:00:06.841964
-[... same warning repeated ~39 more times for this tick, delays 0:00:06.34-0:00:06.84 ...]
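Note the loss scale column: it reads 2048.0 through iteration 6510 and 1024.0 from iteration 6520 on. That halving is the signature of dynamic loss scaling in fp16 training: gradients are computed on a scaled loss so small values stay representable, the scale is backed off when an overflow is detected, and it is grown back after a long run of overflow-free steps. A generic sketch of that algorithm follows (hyperparameters and names are illustrative, not the exact Megatron-DeepSpeed scaler):

    class DynamicLossScaler:
        """Textbook dynamic loss scaling; the constants are illustrative."""

        def __init__(self, init_scale=2048.0, growth_interval=1000,
                     backoff_factor=0.5, growth_factor=2.0):
            self.scale = init_scale
            self.growth_interval = growth_interval
            self.backoff_factor = backoff_factor
            self.growth_factor = growth_factor
            self._clean_steps = 0

        def update(self, found_overflow: bool) -> None:
            if found_overflow:
                # Back off and restart the clean-step counter.
                self.scale = max(self.scale * self.backoff_factor, 1.0)
                self._clean_steps = 0
            else:
                self._clean_steps += 1
                if self._clean_steps % self.growth_interval == 0:
                    self.scale *= self.growth_factor

    scaler = DynamicLossScaler()
    scaler.update(found_overflow=True)
    print(scaler.scale)  # 1024.0, the same drop seen between iterations 6510 and 6520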
2021-09-30 06:42:21 CEST)" was missed by 0:00:06.640168 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:42:21 CEST)" was missed by 0:00:06.509185 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:42:21 CEST)" was missed by 0:00:06.747693 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:42:21 CEST)" was missed by 0:00:06.799182 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:42:21 CEST)" was missed by 0:00:06.527999 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:42:21 CEST)" was missed by 0:00:06.678473 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:42:21 CEST)" was missed by 0:00:06.620776 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:42:21 CEST)" was missed by 0:00:06.481930 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:42:21 CEST)" was missed by 0:00:06.649452 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:42:21 CEST)" was missed by 0:00:06.529168 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:42:21 CEST)" was missed by 0:00:06.777713 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:42:21 CEST)" was missed by 0:00:06.521077 -[2021-09-30 06:42:15] PULSE: tr8-104B is running for 2:50:08 since 2021-09-30T03:52:07 (1289770 on 'gpu_p13' partition (r6i4n[5-6,8],r6i5n[4-5],r7i0n[5-8],r7i1n0,r8i2n8,r8i4n1,r8i7n[3-8],r9i0n[0-8],r9i1n[0-8],r9i2n[3-8],r9i3n[7-8],r9i4n[0-2],r9i5n[2,5-7],r9i6n[2-8],r14i7n[1-6]) -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:43:21 CEST)" was missed by 0:00:06.661339 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:43:21 CEST)" was missed by 0:00:06.767808 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:43:21 CEST)" was missed by 0:00:06.819301 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:43:21 CEST)" was missed by 0:00:06.861590 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:43:21 CEST)" was missed by 0:00:06.804060 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:43:21 CEST)" was missed by 
0:00:06.862410 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:43:21 CEST)" was missed by 0:00:06.662600 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:43:21 CEST)" was missed by 0:00:06.698576 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:43:21 CEST)" was missed by 0:00:06.568258 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:43:21 CEST)" was missed by 0:00:06.640896 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:43:21 CEST)" was missed by 0:00:06.630060 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:43:21 CEST)" was missed by 0:00:06.731476 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:43:21 CEST)" was missed by 0:00:06.829712 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:43:21 CEST)" was missed by 0:00:06.502001 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:43:21 CEST)" was missed by 0:00:06.608311 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:43:21 CEST)" was missed by 0:00:06.505518 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:43:21 CEST)" was missed by 0:00:06.548100 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:43:21 CEST)" was missed by 0:00:06.669567 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:43:21 CEST)" was missed by 0:00:06.816541 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:43:21 CEST)" was missed by 0:00:06.758153 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:43:21 CEST)" was missed by 0:00:06.766550 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:43:21 CEST)" was missed by 0:00:06.709320 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:43:21 CEST)" was missed by 0:00:06.508511 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:43:21 CEST)" was missed by 0:00:06.629207 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: 
interval[0:01:00], next run at: 2021-09-30 06:43:21 CEST)" was missed by 0:00:06.791725 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:43:21 CEST)" was missed by 0:00:06.816985 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:43:21 CEST)" was missed by 0:00:06.356025 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:43:21 CEST)" was missed by 0:00:06.541348 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:43:21 CEST)" was missed by 0:00:06.470092 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:43:21 CEST)" was missed by 0:00:06.579756 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:43:21 CEST)" was missed by 0:00:06.732875 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:43:21 CEST)" was missed by 0:00:06.797822 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:43:21 CEST)" was missed by 0:00:06.715899 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:43:21 CEST)" was missed by 0:00:06.611889 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:43:21 CEST)" was missed by 0:00:06.549298 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:43:21 CEST)" was missed by 0:00:06.515939 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:43:21 CEST)" was missed by 0:00:06.541203 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:43:21 CEST)" was missed by 0:00:06.818876 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:43:21 CEST)" was missed by 0:00:06.529570 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:43:21 CEST)" was missed by 0:00:06.660572 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:44:21 CEST)" was missed by 0:00:07.185299 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:44:21 CEST)" was missed by 0:00:07.243652 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:44:21 CEST)" was missed by 0:00:07.042616 -WARNING:apscheduler.executors.default:Run 
time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:44:21 CEST)" was missed by 0:00:07.149108 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:44:21 CEST)" was missed by 0:00:07.200577 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:44:21 CEST)" was missed by 0:00:07.242847 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:44:21 CEST)" was missed by 0:00:07.043914 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:44:21 CEST)" was missed by 0:00:07.079867 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:44:21 CEST)" was missed by 0:00:06.949531 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:44:21 CEST)" was missed by 0:00:07.022166 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:44:21 CEST)" was missed by 0:00:07.011341 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:44:21 CEST)" was missed by 0:00:07.197798 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:44:21 CEST)" was missed by 0:00:07.090584 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:44:21 CEST)" was missed by 0:00:06.883295 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:44:21 CEST)" was missed by 0:00:06.989590 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:44:21 CEST)" was missed by 0:00:06.737295 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:44:21 CEST)" was missed by 0:00:06.851317 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:44:21 CEST)" was missed by 0:00:06.961040 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:44:21 CEST)" was missed by 0:00:06.929405 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:44:21 CEST)" was missed by 0:00:07.050849 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:44:21 CEST)" was missed by 0:00:06.993154 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:44:21 CEST)" was missed by 
0:00:07.112804
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:45:21 CEST)" was missed by 0:00:07.710719
- iteration 6560/ 159576 | consumed samples: 221600 | elapsed time per iteration (ms): 30277.7 | learning rate: 6.000E-05 | global batch size: 80 | lm loss: 6.306966E+00 | loss scale: 1024.0 | grad norm: 27994.872 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:46:21 CEST)" was missed by 0:00:08.101563
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:47:21 CEST)" was missed by 0:00:10.749619
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:48:21 CEST)" was missed by 0:00:12.613708
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 06:49:21 CEST)" was missed by 0:00:12.817252
- iteration 6570/ 159576 | consumed samples: 222400 | elapsed time per iteration (ms): 30604.9 | learning rate: 6.000E-05 | global batch size: 80 | lm loss: 6.300595E+00 | loss scale: 1024.0 | grad norm: 26978.212 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6580/ 159576 | consumed samples: 223200 | elapsed time per iteration (ms): 30439.1 | learning rate: 6.000E-05 | global batch size: 80 | lm loss: 6.323712E+00 | loss scale: 1024.0 | grad norm: 23410.505 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6590/ 159576 | consumed samples: 224000 | elapsed time per iteration (ms): 30455.3 | learning rate: 6.000E-05 | global batch size: 80 | lm loss: 6.491868E+00 | loss scale: 1024.0 | grad norm: 23219.864 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6600/ 159576 | consumed samples: 224800 | elapsed time per iteration (ms): 30012.6 | learning rate: 6.000E-05 | global batch size: 80 | lm loss: 6.859193E+00 | loss scale: 1024.0 | grad norm: 21108.820 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-saving checkpoint at iteration 6600 to /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints
-[2021-09-30 07:05:19,729] [INFO] [logging.py:68:log_dist] [Rank 0] Saving model checkpoint: /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/global_step6600/mp_rank_00_model_states.pt
- successfully saved checkpoint at iteration 6600 to /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints
-time (ms) | save-checkpoint: 17612.65
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:07:21 CEST)" was missed by 0:00:10.456136
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:08:21 CEST)" was missed by 0:00:08.868069
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:09:21 CEST)" was missed by 0:00:08.539910
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:10:21 CEST)" was missed by 0:00:07.888715
time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:10:21 CEST)" was missed by 0:00:07.646363 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:10:21 CEST)" was missed by 0:00:07.638453 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:10:21 CEST)" was missed by 0:00:07.613035 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:10:21 CEST)" was missed by 0:00:07.638250 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:10:21 CEST)" was missed by 0:00:07.645263 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:10:21 CEST)" was missed by 0:00:07.766692 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:10:21 CEST)" was missed by 0:00:07.829954 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:10:21 CEST)" was missed by 0:00:07.894925 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:10:21 CEST)" was missed by 0:00:07.726347 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:10:21 CEST)" was missed by 0:00:07.602664 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:10:21 CEST)" was missed by 0:00:07.915953 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:10:21 CEST)" was missed by 0:00:07.812992 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:10:21 CEST)" was missed by 0:00:07.855330 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:10:21 CEST)" was missed by 0:00:07.626669 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:10:21 CEST)" was missed by 0:00:07.757675 - iteration 6610/ 159576 | consumed samples: 225600 | elapsed time per iteration (ms): 31558.1 | learning rate: 6.000E-05 | global batch size: 80 | lm loss: 6.826175E+00 | loss scale: 1024.0 | grad norm: 19041.763 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:11:21 CEST)" was missed by 0:00:08.892454 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:11:21 CEST)" was missed by 0:00:09.043306 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:11:21 CEST)" was missed by 0:00:08.831339 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:11:21 CEST)" was missed by 0:00:09.114048 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:11:21 CEST)" was missed by 0:00:08.914251 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:11:21 CEST)" was missed by 0:00:08.913020 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:11:21 CEST)" was missed by 0:00:08.950233 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:11:21 CEST)" was missed by 0:00:08.819905 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:11:21 CEST)" was missed by 0:00:08.881709 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:11:21 CEST)" was missed by 0:00:09.068115 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:11:21 CEST)" was missed by 0:00:08.983124 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:11:21 CEST)" was missed by 0:00:09.081386 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:11:21 CEST)" was missed by 0:00:08.960940 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:11:21 CEST)" was missed by 0:00:09.019497 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:11:21 CEST)" was missed by 0:00:08.753622 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:11:21 CEST)" was missed by 0:00:08.760142 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:11:21 CEST)" was missed by 0:00:08.859924 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:11:21 CEST)" was missed by 0:00:08.607642 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:11:21 CEST)" was missed by 0:00:08.792943 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:11:21 CEST)" was missed by 0:00:08.721688 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:11:21 CEST)" was missed by 0:00:08.799749 
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:11:21 CEST)" was missed by 0:00:09.113237 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:11:21 CEST)" was missed by 0:00:08.921222 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:11:21 CEST)" was missed by 0:00:09.055727 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:11:21 CEST)" was missed by 0:00:08.984487 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:11:21 CEST)" was missed by 0:00:09.049449 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:11:21 CEST)" was missed by 0:00:08.967518 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:11:21 CEST)" was missed by 0:00:08.863485 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:11:21 CEST)" was missed by 0:00:09.009842 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:11:21 CEST)" was missed by 0:00:09.018187 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:11:21 CEST)" was missed by 0:00:08.800913 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:11:21 CEST)" was missed by 0:00:08.880858 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:11:21 CEST)" was missed by 0:00:09.068591 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:11:21 CEST)" was missed by 0:00:09.071005 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:11:21 CEST)" was missed by 0:00:08.767577 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:11:21 CEST)" was missed by 0:00:08.757201 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:11:21 CEST)" was missed by 0:00:08.781179 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:11:21 CEST)" was missed by 0:00:08.792817 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:11:21 CEST)" was missed by 0:00:09.070498 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 
2021-09-30 07:11:21 CEST)" was missed by 0:00:08.912182 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:12:21 CEST)" was missed by 0:00:09.549359 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:12:21 CEST)" was missed by 0:00:09.327811 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:12:21 CEST)" was missed by 0:00:09.478644 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:12:21 CEST)" was missed by 0:00:09.266665 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:12:21 CEST)" was missed by 0:00:09.349599 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:12:21 CEST)" was missed by 0:00:09.348330 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:12:21 CEST)" was missed by 0:00:09.385573 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:12:21 CEST)" was missed by 0:00:09.255230 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:12:21 CEST)" was missed by 0:00:09.317008 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:12:21 CEST)" was missed by 0:00:09.503444 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:12:21 CEST)" was missed by 0:00:09.516727 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:12:21 CEST)" was missed by 0:00:09.453466 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:12:21 CEST)" was missed by 0:00:09.396262 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:12:21 CEST)" was missed by 0:00:09.454788 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:12:21 CEST)" was missed by 0:00:09.188952 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:12:21 CEST)" was missed by 0:00:09.195469 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:12:21 CEST)" was missed by 0:00:09.295266 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:12:21 CEST)" was missed by 0:00:09.503895 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:12:21 CEST)" was missed by 0:00:09.506311 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:12:21 CEST)" was missed by 0:00:09.042955 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:12:21 CEST)" was missed by 0:00:09.228279 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:12:21 CEST)" was missed by 0:00:09.157020 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:12:21 CEST)" was missed by 0:00:09.216470 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:12:21 CEST)" was missed by 0:00:09.235097 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:12:21 CEST)" was missed by 0:00:09.548551 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:12:21 CEST)" was missed by 0:00:09.356561 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:12:21 CEST)" was missed by 0:00:09.491057 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:12:21 CEST)" was missed by 0:00:09.419791 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:12:21 CEST)" was missed by 0:00:09.484754 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:12:21 CEST)" was missed by 0:00:09.298835 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:12:21 CEST)" was missed by 0:00:09.418469 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:12:21 CEST)" was missed by 0:00:09.236229 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:12:21 CEST)" was missed by 0:00:09.316173 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:12:21 CEST)" was missed by 0:00:09.202878 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:12:21 CEST)" was missed by 0:00:09.192516 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:12:21 CEST)" was missed by 0:00:09.228131 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:12:21 CEST)" was missed by 0:00:09.505824 
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:12:21 CEST)" was missed by 0:00:09.347494 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:12:21 CEST)" was missed by 0:00:09.402861 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:12:21 CEST)" was missed by 0:00:09.445183 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:13:21 CEST)" was missed by 0:00:09.360427 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:13:21 CEST)" was missed by 0:00:09.582014 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:13:21 CEST)" was missed by 0:00:09.380926 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:13:21 CEST)" was missed by 0:00:09.418193 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:13:21 CEST)" was missed by 0:00:09.287898 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:13:21 CEST)" was missed by 0:00:09.349668 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:13:21 CEST)" was missed by 0:00:09.536063 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:13:21 CEST)" was missed by 0:00:09.451081 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:13:21 CEST)" was missed by 0:00:09.549347 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:13:21 CEST)" was missed by 0:00:09.487434 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:13:21 CEST)" was missed by 0:00:09.221572 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:13:21 CEST)" was missed by 0:00:09.228064 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:13:21 CEST)" was missed by 0:00:09.511284 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:13:21 CEST)" was missed by 0:00:09.538909 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:13:21 CEST)" was missed by 0:00:09.075595 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 
2021-09-30 07:13:21 CEST)" was missed by 0:00:09.189640 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:13:21 CEST)" was missed by 0:00:09.299305 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:13:21 CEST)" was missed by 0:00:09.581198 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:13:21 CEST)" was missed by 0:00:09.523680 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:13:21 CEST)" was missed by 0:00:09.382246 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:13:21 CEST)" was missed by 0:00:09.331443 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:13:21 CEST)" was missed by 0:00:09.428899 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:13:21 CEST)" was missed by 0:00:09.327881 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:13:21 CEST)" was missed by 0:00:09.348810 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:13:21 CEST)" was missed by 0:00:09.536560 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:13:21 CEST)" was missed by 0:00:09.260927 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:13:21 CEST)" was missed by 0:00:09.235527 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:13:21 CEST)" was missed by 0:00:09.260768 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:13:21 CEST)" was missed by 0:00:09.267724 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:13:21 CEST)" was missed by 0:00:09.389170 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:13:21 CEST)" was missed by 0:00:09.517409 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:13:21 CEST)" was missed by 0:00:09.477788 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:13:21 CEST)" was missed by 0:00:09.268884 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:13:21 CEST)" was missed by 0:00:09.225171 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:13:21 CEST)" was missed by 0:00:09.538457 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:13:21 CEST)" was missed by 0:00:09.452497 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:13:21 CEST)" was missed by 0:00:09.486189 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:13:21 CEST)" was missed by 0:00:09.435519 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:13:21 CEST)" was missed by 0:00:09.249209 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:13:21 CEST)" was missed by 0:00:09.380204 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:14:21 CEST)" was missed by 0:00:09.126722 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:14:21 CEST)" was missed by 0:00:08.905163 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:14:21 CEST)" was missed by 0:00:09.080779 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:14:21 CEST)" was missed by 0:00:08.772810 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:14:21 CEST)" was missed by 0:00:09.055984 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:14:21 CEST)" was missed by 0:00:08.844016 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:14:21 CEST)" was missed by 0:00:09.125894 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:14:21 CEST)" was missed by 0:00:08.933869 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:14:21 CEST)" was missed by 0:00:09.068381 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:14:21 CEST)" was missed by 0:00:08.926948 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:14:21 CEST)" was missed by 0:00:08.925683 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:14:21 CEST)" was missed by 0:00:08.962921 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:14:21 CEST)" was missed by 0:00:08.832601 
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:14:21 CEST)" was missed by 0:00:08.894353 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:14:21 CEST)" was missed by 0:00:08.876148 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:14:21 CEST)" was missed by 0:00:08.995807 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:14:21 CEST)" was missed by 0:00:09.094093 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:14:21 CEST)" was missed by 0:00:08.973622 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:14:21 CEST)" was missed by 0:00:09.032176 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:14:21 CEST)" was missed by 0:00:08.766304 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:14:21 CEST)" was missed by 0:00:08.872586 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:14:21 CEST)" was missed by 0:00:09.081233 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:14:21 CEST)" was missed by 0:00:09.083646 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:14:21 CEST)" was missed by 0:00:08.620303 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:14:21 CEST)" was missed by 0:00:08.805640 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:14:21 CEST)" was missed by 0:00:08.780224 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:14:21 CEST)" was missed by 0:00:08.734371 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:14:21 CEST)" was missed by 0:00:08.812455 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:14:21 CEST)" was missed by 0:00:08.997192 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:14:21 CEST)" was missed by 0:00:09.062128 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:14:21 CEST)" was missed by 0:00:08.980219 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 
2021-09-30 07:14:21 CEST)" was missed by 0:00:09.022518 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:14:21 CEST)" was missed by 0:00:09.030864 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:14:21 CEST)" was missed by 0:00:08.813610 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:14:21 CEST)" was missed by 0:00:08.893546 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:14:21 CEST)" was missed by 0:00:08.769886 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:14:21 CEST)" was missed by 0:00:08.793876 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:14:21 CEST)" was missed by 0:00:08.805506 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:14:21 CEST)" was missed by 0:00:09.083176 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:14:21 CEST)" was missed by 0:00:08.924904 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:15:21 CEST)" was missed by 0:00:10.032774 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:15:21 CEST)" was missed by 0:00:09.811248 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:15:21 CEST)" was missed by 0:00:10.000127 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:15:21 CEST)" was missed by 0:00:09.938197 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:15:21 CEST)" was missed by 0:00:09.672350 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:15:21 CEST)" was missed by 0:00:09.678823 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:15:21 CEST)" was missed by 0:00:09.962026 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:15:21 CEST)" was missed by 0:00:09.750083 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:15:21 CEST)" was missed by 0:00:10.031952 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:15:21 CEST)" was missed by 0:00:09.974434 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:15:21 CEST)" was missed by 0:00:09.833030 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:15:21 CEST)" was missed by 0:00:09.831713 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:15:21 CEST)" was missed by 0:00:09.868993 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:15:21 CEST)" was missed by 0:00:09.738660 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:15:21 CEST)" was missed by 0:00:09.800428 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:15:21 CEST)" was missed by 0:00:09.986859 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:15:21 CEST)" was missed by 0:00:09.782229 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:15:21 CEST)" was missed by 0:00:09.901850 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:15:21 CEST)" was missed by 0:00:09.879689 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:15:21 CEST)" was missed by 0:00:09.778664 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:15:21 CEST)" was missed by 0:00:09.987303 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:15:21 CEST)" was missed by 0:00:09.989730 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:15:21 CEST)" was missed by 0:00:09.526370 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:15:21 CEST)" was missed by 0:00:09.640420 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:15:21 CEST)" was missed by 0:00:09.839957 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:15:21 CEST)" was missed by 0:00:09.968153 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:15:21 CEST)" was missed by 0:00:09.928567 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:15:21 CEST)" was missed by 0:00:09.719636 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:15:21 CEST)" was missed by 0:00:09.799590 
- iteration 6620/ 159576 | consumed samples: 226400 | elapsed time per iteration (ms): 30107.3 | learning rate: 6.000E-05 | global batch size: 80 | lm loss: 6.742154E+00 | loss scale: 1024.0 | grad norm: 28021.206 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:16:21 CEST)" was missed by 0:00:09.812963
[... ~40 such warnings per minute for next runs at 07:16:21 through 07:18:21 CEST, each missed by ~8.8-10.2 s ...]
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:18:21 CEST)" was missed by 0:00:10.046582 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:18:21 CEST)" was missed by 0:00:09.860236 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:19:21 CEST)" was missed by 0:00:10.958685 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:19:21 CEST)" was missed by 0:00:11.109509 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:19:21 CEST)" was missed by 0:00:10.897487 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:19:21 CEST)" was missed by 0:00:11.180248 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:19:21 CEST)" was missed by 0:00:10.980470 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:19:21 CEST)" was missed by 0:00:10.979183 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:19:21 CEST)" was missed by 0:00:10.886121 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:19:21 CEST)" was missed by 0:00:10.947918 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:19:21 CEST)" was missed by 0:00:11.134258 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:19:21 CEST)" was missed by 0:00:10.929638 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:19:21 CEST)" was missed by 0:00:11.049325 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:19:21 CEST)" was missed by 0:00:11.147580 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:19:21 CEST)" was missed by 0:00:11.027088 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:19:21 CEST)" was missed by 0:00:11.085667 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:19:21 CEST)" was missed by 0:00:10.819797 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:19:21 CEST)" was missed by 0:00:10.826336 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 
2021-09-30 07:19:21 CEST)" was missed by 0:00:10.926082 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:19:21 CEST)" was missed by 0:00:11.137163 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:19:21 CEST)" was missed by 0:00:10.673839 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:19:21 CEST)" was missed by 0:00:10.865944 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:19:21 CEST)" was missed by 0:00:11.179430 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:19:21 CEST)" was missed by 0:00:11.121934 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:19:21 CEST)" was missed by 0:00:11.016425 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:19:21 CEST)" was missed by 0:00:10.867085 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:19:21 CEST)" was missed by 0:00:11.134793 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:19:21 CEST)" was missed by 0:00:10.787883 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:19:21 CEST)" was missed by 0:00:10.859010 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:19:21 CEST)" was missed by 0:00:10.987407 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:19:21 CEST)" was missed by 0:00:11.115634 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:19:21 CEST)" was missed by 0:00:10.859181 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:19:21 CEST)" was missed by 0:00:10.823391 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:19:21 CEST)" was missed by 0:00:11.076039 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:19:21 CEST)" was missed by 0:00:10.947084 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:19:21 CEST)" was missed by 0:00:11.136717 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:19:21 CEST)" was missed by 0:00:10.833792 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:19:21 CEST)" was missed by 0:00:11.050745 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:19:21 CEST)" was missed by 0:00:11.084466 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:19:21 CEST)" was missed by 0:00:10.847436 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:19:21 CEST)" was missed by 0:00:11.033790 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:19:21 CEST)" was missed by 0:00:10.978451 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:20:21 CEST)" was missed by 0:00:11.018127 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:20:21 CEST)" was missed by 0:00:11.168967 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:20:21 CEST)" was missed by 0:00:11.239726 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:20:21 CEST)" was missed by 0:00:11.038639 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:20:21 CEST)" was missed by 0:00:11.075892 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:20:21 CEST)" was missed by 0:00:10.945584 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:20:21 CEST)" was missed by 0:00:11.007374 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:20:21 CEST)" was missed by 0:00:11.193750 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:20:21 CEST)" was missed by 0:00:11.108793 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:20:21 CEST)" was missed by 0:00:11.207034 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:20:21 CEST)" was missed by 0:00:11.086570 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:20:21 CEST)" was missed by 0:00:11.145137 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:20:21 CEST)" was missed by 0:00:10.733300 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:20:21 CEST)" was missed by 0:00:10.956992 
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:20:21 CEST)" was missed by 0:00:10.925409 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:20:21 CEST)" was missed by 0:00:11.238878 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:20:21 CEST)" was missed by 0:00:11.046831 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:20:21 CEST)" was missed by 0:00:11.181362 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:20:21 CEST)" was missed by 0:00:11.039962 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:20:21 CEST)" was missed by 0:00:10.989129 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:20:21 CEST)" was missed by 0:00:10.879321 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:20:21 CEST)" was missed by 0:00:10.885844 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:20:21 CEST)" was missed by 0:00:10.985578 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:20:21 CEST)" was missed by 0:00:11.196614 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:20:21 CEST)" was missed by 0:00:10.847342 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:20:21 CEST)" was missed by 0:00:10.926574 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:20:21 CEST)" was missed by 0:00:11.194284 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:20:21 CEST)" was missed by 0:00:10.918657 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:20:21 CEST)" was missed by 0:00:10.918492 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:20:21 CEST)" was missed by 0:00:11.175124 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:20:21 CEST)" was missed by 0:00:11.135523 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:20:21 CEST)" was missed by 0:00:11.143870 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 
2021-09-30 07:20:21 CEST)" was missed by 0:00:11.006542 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:20:21 CEST)" was missed by 0:00:10.893252 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:20:21 CEST)" was missed by 0:00:10.882904 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:20:21 CEST)" was missed by 0:00:11.110222 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:20:21 CEST)" was missed by 0:00:11.093259 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:20:21 CEST)" was missed by 0:00:10.906904 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:20:21 CEST)" was missed by 0:00:11.196193 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:20:21 CEST)" was missed by 0:00:11.037920 - iteration 6630/ 159576 | consumed samples: 227200 | elapsed time per iteration (ms): 30197.7 | learning rate: 6.000E-05 | global batch size: 80 | lm loss: 6.797427E+00 | loss scale: 1024.0 | grad norm: 27869.081 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:21:21 CEST)" was missed by 0:00:11.793874 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:21:21 CEST)" was missed by 0:00:11.773402 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:21:21 CEST)" was missed by 0:00:11.924220 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:21:21 CEST)" was missed by 0:00:11.712251 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:21:21 CEST)" was missed by 0:00:11.936612 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:21:21 CEST)" was missed by 0:00:11.994985 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:21:21 CEST)" was missed by 0:00:11.795176 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:21:21 CEST)" was missed by 0:00:11.831165 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:21:21 CEST)" was missed by 0:00:11.700842 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 
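The misfire warnings bracketing each iteration report come from CodeCarbon's emissions tracker: BaseEmissionsTracker schedules a _measure_power job on a one-minute APScheduler interval trigger (visible in the warning text itself), apparently one tracker per process, and with each training step taking ~30 s the scheduler thread fires late, so APScheduler logs how far behind it ran. A minimal sketch of that mechanism, assuming APScheduler 3.x; measure_power and the keep-alive sleep below are hypothetical stand-ins, not codecarbon's actual code:

    import time
    from apscheduler.schedulers.background import BackgroundScheduler

    def measure_power():
        # stand-in for BaseEmissionsTracker._measure_power
        print("sampling power counters")

    scheduler = BackgroundScheduler()
    # APScheduler's default misfire_grace_time is 1 s: any job that starts
    # more than 1 s late triggers the "was missed by ..." warning above.
    # Raising it past the observed ~9-13 s delays would silence the noise.
    scheduler.add_job(measure_power, "interval", minutes=1,
                      misfire_grace_time=30)
    scheduler.start()
    try:
        time.sleep(300)  # keep the main thread alive for a few intervals
    finally:
        scheduler.shutdown()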
[... repeated per-rank warnings elided: same job missed by ~0:00:11-0:00:13, for next runs at 2021-09-30 07:21:21 and 07:22:21 CEST ...]
- iteration 6640/ 159576 | consumed samples: 228000 | elapsed time per iteration (ms): 30533.5 | learning rate: 6.000E-05 | global batch size: 80 | lm loss: 6.645786E+00 | loss scale: 1024.0 | grad norm: 45122.213 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6650/ 159576 | consumed samples: 228800 | elapsed time per iteration (ms): 30095.1 | learning rate: 6.000E-05 | global batch size: 80 | lm loss: 6.575365E+00 | loss scale: 1024.0 | grad norm: 33729.342 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6660/ 159576 | consumed samples: 229600 | elapsed time per iteration (ms): 29459.8 | learning rate: 6.000E-05 | global batch size: 80 | lm loss: 6.526109E+00 | loss scale: 1024.0 | grad norm: 53212.486 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
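Between the warning bursts, the iteration lines carry the actual training signal, and they are internally consistent: consumed samples advance by 800 every 10 iterations, i.e. 10 times the global batch size of 80, and at ~30 s per iteration that is about 2.6 samples/s. A hypothetical parser for these lines (the regex and field names are my own, not part of the training code):

    import re

    ITER_RE = re.compile(
        r"iteration\s+(?P<it>\d+)/\s*(?P<total>\d+)"
        r".*?consumed samples:\s*(?P<samples>\d+)"
        r".*?elapsed time per iteration \(ms\):\s*(?P<ms>[\d.]+)"
        r".*?lm loss:\s*(?P<loss>[\d.E+-]+)"
    )

    def parse_iteration(line):
        """Extract the numeric fields from one 'iteration ... |' log line."""
        m = ITER_RE.search(line)
        if m is None:
            return None
        return {
            "iteration": int(m["it"]),
            "total_iterations": int(m["total"]),
            "consumed_samples": int(m["samples"]),
            "ms_per_iteration": float(m["ms"]),
            "lm_loss": float(m["loss"]),
        }

    print(parse_iteration(
        "iteration 6660/ 159576 | consumed samples: 229600 | "
        "elapsed time per iteration (ms): 29459.8 | learning rate: 6.000E-05 | "
        "global batch size: 80 | lm loss: 6.526109E+00 | loss scale: 1024.0 |"
    ))
    # -> {'iteration': 6660, 'total_iterations': 159576,
    #     'consumed_samples': 229600, 'ms_per_iteration': 29459.8,
    #     'lm_loss': 6.526109}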
time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:36:21 CEST)" was missed by 0:00:11.617876 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:36:21 CEST)" was missed by 0:00:11.665089 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:36:21 CEST)" was missed by 0:00:11.932805 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:36:21 CEST)" was missed by 0:00:11.657159 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:36:21 CEST)" was missed by 0:00:11.977459 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:36:21 CEST)" was missed by 0:00:11.785410 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:36:21 CEST)" was missed by 0:00:11.847362 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:36:21 CEST)" was missed by 0:00:11.945629 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:36:21 CEST)" was missed by 0:00:11.825157 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:36:21 CEST)" was missed by 0:00:11.585913 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:36:21 CEST)" was missed by 0:00:11.657015 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:36:21 CEST)" was missed by 0:00:11.848676 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:36:21 CEST)" was missed by 0:00:11.913675 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:36:21 CEST)" was missed by 0:00:11.831709 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:36:21 CEST)" was missed by 0:00:11.874081 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:36:21 CEST)" was missed by 0:00:11.907608 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:36:21 CEST)" was missed by 0:00:11.631798 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:36:21 CEST)" was missed by 0:00:11.621416 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:36:21 CEST)" was missed by 
0:00:11.645369 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:36:21 CEST)" was missed by 0:00:11.776357 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:36:21 CEST)" was missed by 0:00:11.745123 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:36:21 CEST)" was missed by 0:00:11.934725 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:37:21 CEST)" was missed by 0:00:12.462053 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:37:21 CEST)" was missed by 0:00:12.534699 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:37:21 CEST)" was missed by 0:00:12.756253 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:37:21 CEST)" was missed by 0:00:12.556466 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:37:21 CEST)" was missed by 0:00:12.402375 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:37:21 CEST)" was missed by 0:00:12.555225 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:37:21 CEST)" was missed by 0:00:12.523927 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:37:21 CEST)" was missed by 0:00:12.713187 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:37:21 CEST)" was missed by 0:00:12.473598 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:37:21 CEST)" was missed by 0:00:12.592472 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:37:21 CEST)" was missed by 0:00:12.710392 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:37:21 CEST)" was missed by 0:00:12.661738 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:37:21 CEST)" was missed by 0:00:12.395881 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:37:21 CEST)" was missed by 0:00:12.443121 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:37:21 CEST)" was missed by 0:00:12.710821 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: 
interval[0:01:00], next run at: 2021-09-30 07:37:21 CEST)" was missed by 0:00:12.249875 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:37:21 CEST)" was missed by 0:00:12.435170 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:37:21 CEST)" was missed by 0:00:12.363906 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:37:21 CEST)" was missed by 0:00:12.441992 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:37:21 CEST)" was missed by 0:00:12.755465 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:37:21 CEST)" was missed by 0:00:12.697958 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:37:21 CEST)" was missed by 0:00:12.723668 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:37:21 CEST)" was missed by 0:00:12.660420 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:37:21 CEST)" was missed by 0:00:12.603193 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:37:21 CEST)" was missed by 0:00:12.502202 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:37:21 CEST)" was missed by 0:00:12.685585 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:37:21 CEST)" was missed by 0:00:12.563443 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:37:21 CEST)" was missed by 0:00:12.505762 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:37:21 CEST)" was missed by 0:00:12.625405 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:37:21 CEST)" was missed by 0:00:12.409811 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:37:21 CEST)" was missed by 0:00:12.399415 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:37:21 CEST)" was missed by 0:00:12.435056 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:37:21 CEST)" was missed by 0:00:12.691701 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:37:21 CEST)" was missed by 0:00:12.652102 -WARNING:apscheduler.executors.default:Run 
time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:37:21 CEST)" was missed by 0:00:12.626774 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:37:21 CEST)" was missed by 0:00:12.609800 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:37:21 CEST)" was missed by 0:00:12.712727 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:37:21 CEST)" was missed by 0:00:12.523142 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:37:21 CEST)" was missed by 0:00:12.423445 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:37:21 CEST)" was missed by 0:00:12.554482 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:40:21 CEST)" was missed by 0:00:11.943541 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:40:21 CEST)" was missed by 0:00:11.955027 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:40:21 CEST)" was missed by 0:00:12.037922 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:40:21 CEST)" was missed by 0:00:12.073934 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:40:21 CEST)" was missed by 0:00:12.005379 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:40:21 CEST)" was missed by 0:00:11.883841 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:40:21 CEST)" was missed by 0:00:11.731317 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:40:21 CEST)" was missed by 0:00:11.987179 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:40:21 CEST)" was missed by 0:00:11.877345 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:40:21 CEST)" was missed by 0:00:11.923493 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:40:21 CEST)" was missed by 0:00:11.983629 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:40:21 CEST)" was missed by 0:00:11.916663 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:40:21 CEST)" was missed by 
0:00:11.845402 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:40:21 CEST)" was missed by 0:00:12.044881 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:40:21 CEST)" was missed by 0:00:11.916497 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:40:21 CEST)" was missed by 0:00:11.891263 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:40:21 CEST)" was missed by 0:00:12.091253 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:40:21 CEST)" was missed by 0:00:12.133554 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:40:21 CEST)" was missed by 0:00:12.004573 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:40:21 CEST)" was missed by 0:00:11.880895 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:40:21 CEST)" was missed by 0:00:11.924625 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:40:21 CEST)" was missed by 0:00:11.904904 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:40:21 CEST)" was missed by 0:00:12.035926 - iteration 6670/ 159576 | consumed samples: 230400 | elapsed time per iteration (ms): 29914.3 | learning rate: 6.000E-05 | global batch size: 80 | lm loss: 6.445564E+00 | loss scale: 1024.0 | grad norm: 25396.175 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:41:21 CEST)" was missed by 0:00:11.498024 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:41:21 CEST)" was missed by 0:00:11.431809 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:41:21 CEST)" was missed by 0:00:11.438349 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:41:21 CEST)" was missed by 0:00:11.509572 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:41:21 CEST)" was missed by 0:00:11.592445 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:41:21 CEST)" was missed by 0:00:11.628425 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:41:21 CEST)" was missed by 0:00:11.559882 
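The lm loss is still falling (6.80 to 6.45 across iterations 6630-6670) while throughput holds near 30 s per step. A back-of-the-envelope ETA from the iteration 6670 line, assuming (unrealistically) that this pace and the 159576-iteration schedule stay fixed:

    # ETA sketch using only numbers from the iteration 6670 line above.
    current_it, total_it = 6670, 159576
    ms_per_it = 29914.3
    remaining_s = (total_it - current_it) * ms_per_it / 1000
    print(f"~{remaining_s / 86400:.1f} days left")  # ~52.9 days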
[... repeated per-rank warnings elided: same job missed by ~0:00:11-0:00:12, for next runs at 2021-09-30 07:41:21 and 07:42:21 CEST ...]
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:42:21 CEST)" was missed by 0:00:11.171201 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:42:21 CEST)" was missed by 0:00:11.240071 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:42:21 CEST)" was missed by 0:00:11.338323 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:42:21 CEST)" was missed by 0:00:11.275097 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:42:21 CEST)" was missed by 0:00:11.057844 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:42:21 CEST)" was missed by 0:00:11.327905 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:42:21 CEST)" was missed by 0:00:11.049891 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:42:21 CEST)" was missed by 0:00:11.049731 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:42:21 CEST)" was missed by 0:00:11.266791 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:42:21 CEST)" was missed by 0:00:11.137800 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:42:21 CEST)" was missed by 0:00:11.024509 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:42:21 CEST)" was missed by 0:00:11.014133 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:42:21 CEST)" was missed by 0:00:11.327400 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:42:21 CEST)" was missed by 0:00:11.241456 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:42:21 CEST)" was missed by 0:00:11.224478 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:42:21 CEST)" was missed by 0:00:11.038134 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:42:21 CEST)" was missed by 0:00:11.169130 -[2021-09-30 07:42:10] PULSE: tr8-104B is running for 3:50:03 since 2021-09-30T03:52:07 (1289770 on 'gpu_p13' partition (r6i4n[5-6,8],r6i5n[4-5],r7i0n[5-8],r7i1n0,r8i2n8,r8i4n1,r8i7n[3-8],r9i0n[0-8],r9i1n[0-8],r9i2n[3-8],r9i3n[7-8],r9i4n[0-2],r9i5n[2,5-7],r9i6n[2-8],r14i7n[1-6]) -WARNING:apscheduler.executors.default:Run 
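The APScheduler warnings above appear to come from the codecarbon emissions tracker: its BaseEmissionsTracker._measure_power job runs on a one-minute interval trigger, and on a node saturated by the training processes each run starts roughly 10-11 s late, past the executor's default 1 s misfire grace period. A minimal sketch of the scheduling pattern that produces (and can silence) this warning; measure_power is a hypothetical stand-in for the tracker's method, and only the 60 s interval is taken from the log:

    from apscheduler.schedulers.background import BackgroundScheduler

    def measure_power():
        # Stand-in for BaseEmissionsTracker._measure_power: sample power draw once per minute.
        pass

    scheduler = BackgroundScheduler()
    # misfire_grace_time (seconds) sets how late a run may start before APScheduler
    # logs 'Run time of job ... was missed by ...'; the default of 1 s is why an
    # ~11 s delay on a busy node floods the log with the warnings seen above.
    scheduler.add_job(measure_power, "interval", minutes=1, misfire_grace_time=30)
    scheduler.start()

The warnings are harmless for training itself; they only mean the power samples are taken late.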
time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:43:21 CEST)" was missed by 0:00:11.195009 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:43:21 CEST)" was missed by 0:00:11.122403 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:43:21 CEST)" was missed by 0:00:11.215560 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:43:21 CEST)" was missed by 0:00:11.184257 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:43:21 CEST)" was missed by 0:00:11.322031 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:43:21 CEST)" was missed by 0:00:11.056169 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:43:21 CEST)" was missed by 0:00:11.133920 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:43:21 CEST)" was missed by 0:00:11.216817 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:43:21 CEST)" was missed by 0:00:11.252791 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:43:21 CEST)" was missed by 0:00:11.370719 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:43:21 CEST)" was missed by 0:00:11.263502 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:43:21 CEST)" was missed by 0:00:11.062722 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:43:21 CEST)" was missed by 0:00:11.371127 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:43:21 CEST)" was missed by 0:00:10.910201 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:43:21 CEST)" was missed by 0:00:11.415794 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:43:21 CEST)" was missed by 0:00:11.358300 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:43:21 CEST)" was missed by 0:00:11.416623 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:43:21 CEST)" was missed by 0:00:11.351984 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:43:21 CEST)" was missed by 
0:00:11.320731 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:43:21 CEST)" was missed by 0:00:11.162505 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:43:21 CEST)" was missed by 0:00:11.345905 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:43:21 CEST)" was missed by 0:00:11.024258 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:43:21 CEST)" was missed by 0:00:11.095368 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:43:21 CEST)" was missed by 0:00:11.102330 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:43:21 CEST)" was missed by 0:00:11.223765 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:43:21 CEST)" was missed by 0:00:11.166077 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:43:21 CEST)" was missed by 0:00:11.285729 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:43:21 CEST)" was missed by 0:00:11.383978 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:43:21 CEST)" was missed by 0:00:11.103476 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:43:21 CEST)" was missed by 0:00:11.373563 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:43:21 CEST)" was missed by 0:00:11.095535 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:43:21 CEST)" was missed by 0:00:11.070139 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:43:21 CEST)" was missed by 0:00:11.270103 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:43:21 CEST)" was missed by 0:00:11.059771 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:43:21 CEST)" was missed by 0:00:11.287087 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:43:21 CEST)" was missed by 0:00:11.183471 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:43:21 CEST)" was missed by 0:00:11.373040 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: 
interval[0:01:00], next run at: 2021-09-30 07:43:21 CEST)" was missed by 0:00:11.312458 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:43:21 CEST)" was missed by 0:00:11.083752 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:43:21 CEST)" was missed by 0:00:11.214792 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:44:21 CEST)" was missed by 0:00:11.429549 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:44:21 CEST)" was missed by 0:00:11.450079 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:44:21 CEST)" was missed by 0:00:11.356957 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:44:21 CEST)" was missed by 0:00:11.368467 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:44:21 CEST)" was missed by 0:00:11.651154 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:44:21 CEST)" was missed by 0:00:11.487332 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:44:21 CEST)" was missed by 0:00:11.418807 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:44:21 CEST)" was missed by 0:00:11.605246 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:44:21 CEST)" was missed by 0:00:11.556576 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:44:21 CEST)" was missed by 0:00:11.290762 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:44:21 CEST)" was missed by 0:00:11.297262 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:44:21 CEST)" was missed by 0:00:11.650314 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:44:21 CEST)" was missed by 0:00:11.592825 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:44:21 CEST)" was missed by 0:00:11.451372 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:44:21 CEST)" was missed by 0:00:11.586522 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:44:21 CEST)" was missed by 0:00:11.555253 -WARNING:apscheduler.executors.default:Run 
time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:44:21 CEST)" was missed by 0:00:11.498040 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:44:21 CEST)" was missed by 0:00:11.580442 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:44:21 CEST)" was missed by 0:00:11.258779 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:44:21 CEST)" was missed by 0:00:11.336873 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:44:21 CEST)" was missed by 0:00:11.520260 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:44:21 CEST)" was missed by 0:00:11.618519 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:44:21 CEST)" was missed by 0:00:11.337999 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:44:21 CEST)" was missed by 0:00:11.397068 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:44:21 CEST)" was missed by 0:00:11.605720 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:44:21 CEST)" was missed by 0:00:11.608108 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:44:21 CEST)" was missed by 0:00:11.330087 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:44:21 CEST)" was missed by 0:00:11.329920 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:44:21 CEST)" was missed by 0:00:11.458332 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:44:21 CEST)" was missed by 0:00:11.400633 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:44:21 CEST)" was missed by 0:00:11.144819 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:44:21 CEST)" was missed by 0:00:11.418007 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:44:21 CEST)" was missed by 0:00:11.304704 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:44:21 CEST)" was missed by 0:00:11.546984 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:44:21 CEST)" was missed by 
0:00:11.521679 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:44:21 CEST)" was missed by 0:00:11.504720 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:44:21 CEST)" was missed by 0:00:11.294376 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:44:21 CEST)" was missed by 0:00:11.607647 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:44:21 CEST)" was missed by 0:00:11.318358 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:44:21 CEST)" was missed by 0:00:11.449385 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:45:21 CEST)" was missed by 0:00:10.309161 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:45:21 CEST)" was missed by 0:00:10.329684 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:45:21 CEST)" was missed by 0:00:10.236557 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:45:21 CEST)" was missed by 0:00:10.472399 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:45:21 CEST)" was missed by 0:00:10.530757 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:45:21 CEST)" was missed by 0:00:10.366927 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:45:21 CEST)" was missed by 0:00:10.434845 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:45:21 CEST)" was missed by 0:00:10.176861 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:45:21 CEST)" was missed by 0:00:10.485254 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:45:21 CEST)" was missed by 0:00:10.138359 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:45:21 CEST)" was missed by 0:00:10.248084 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:45:21 CEST)" was missed by 0:00:10.216468 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:45:21 CEST)" was missed by 0:00:10.529915 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: 
interval[0:01:00], next run at: 2021-09-30 07:45:21 CEST)" was missed by 0:00:10.330960 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:45:21 CEST)" was missed by 0:00:10.298420 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:45:21 CEST)" was missed by 0:00:10.484856 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:45:21 CEST)" was missed by 0:00:10.498127 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:45:21 CEST)" was missed by 0:00:10.377656 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:45:21 CEST)" was missed by 0:00:10.436222 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:45:21 CEST)" was missed by 0:00:10.170384 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:45:21 CEST)" was missed by 0:00:10.276646 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:45:21 CEST)" was missed by 0:00:10.460021 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:45:21 CEST)" was missed by 0:00:10.487677 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:45:21 CEST)" was missed by 0:00:10.024364 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:45:21 CEST)" was missed by 0:00:10.209663 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:45:21 CEST)" was missed by 0:00:10.401200 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:45:21 CEST)" was missed by 0:00:10.466154 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:45:21 CEST)" was missed by 0:00:10.280235 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:45:21 CEST)" was missed by 0:00:10.399867 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:45:21 CEST)" was missed by 0:00:10.184276 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:45:21 CEST)" was missed by 0:00:10.209525 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:45:21 CEST)" was missed by 0:00:10.487174 -WARNING:apscheduler.executors.default:Run 
time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:45:21 CEST)" was missed by 0:00:10.217647 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:45:21 CEST)" was missed by 0:00:10.173924 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:45:21 CEST)" was missed by 0:00:10.337923 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:45:21 CEST)" was missed by 0:00:10.384256 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:45:21 CEST)" was missed by 0:00:10.426586 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:45:21 CEST)" was missed by 0:00:10.297605 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:45:21 CEST)" was missed by 0:00:10.197881 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:45:21 CEST)" was missed by 0:00:10.328901 - iteration 6680/ 159576 | consumed samples: 231200 | elapsed time per iteration (ms): 29985.8 | learning rate: 6.000E-05 | global batch size: 80 | lm loss: 6.389253E+00 | loss scale: 1024.0 | grad norm: 67101.312 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:46:21 CEST)" was missed by 0:00:11.249301 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:46:21 CEST)" was missed by 0:00:11.424953 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:46:21 CEST)" was missed by 0:00:11.188190 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:46:21 CEST)" was missed by 0:00:11.307056 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:46:21 CEST)" was missed by 0:00:11.317762 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:46:21 CEST)" was missed by 0:00:11.376325 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:46:21 CEST)" was missed by 0:00:11.110483 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:46:21 CEST)" was missed by 0:00:11.216754 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:46:21 CEST)" was missed by 0:00:11.156589 -WARNING:apscheduler.executors.default:Run time of job 
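The " iteration ... |" record above is the per-step training summary (loss, grad norm, loss scale, timing) flattened into pipe-separated "name: value" fields. A minimal sketch of recovering it as a dict for plotting, assuming only that layout; parse_iteration_line is a hypothetical helper, not part of the training code:

    import re

    def parse_iteration_line(line: str) -> dict:
        # Extract "iteration 6680/ 159576" plus every numeric "name: value" field.
        fields = {}
        m = re.search(r"iteration\s+(\d+)/\s*(\d+)", line)
        if m:
            fields["iteration"] = int(m.group(1))
            fields["total_iterations"] = int(m.group(2))
        for chunk in line.split("|"):
            key, sep, value = chunk.partition(":")
            if not sep:
                continue
            try:
                fields[key.strip()] = float(value)
            except ValueError:
                pass  # skip any non-numeric field
        return fields

    stats = parse_iteration_line(
        " iteration 6680/ 159576 | consumed samples: 231200 | lm loss: 6.389253E+00 | grad norm: 67101.312 |"
    )
    print(stats["lm loss"], stats["grad norm"])  # 6.389253 67101.312

Treating every "name: value" pair generically keeps the parser robust if fields are added or reordered between runs.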
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:46:21 CEST)" was missed by 0:00:11.269875 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:46:21 CEST)" was missed by 0:00:11.176784 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:46:21 CEST)" was missed by 0:00:11.406269 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:46:21 CEST)" was missed by 0:00:11.220324 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:46:21 CEST)" was missed by 0:00:11.117030 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:46:21 CEST)" was missed by 0:00:11.425402 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:46:21 CEST)" was missed by 0:00:11.427801 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:46:21 CEST)" was missed by 0:00:10.964480 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:46:21 CEST)" was missed by 0:00:11.078542 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:46:21 CEST)" was missed by 0:00:11.470110 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:46:21 CEST)" was missed by 0:00:11.278027 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:46:21 CEST)" was missed by 0:00:11.400221 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:46:21 CEST)" was missed by 0:00:11.124395 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:46:21 CEST)" was missed by 0:00:11.149647 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:46:21 CEST)" was missed by 0:00:11.412621 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:46:21 CEST)" was missed by 0:00:11.470999 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:46:21 CEST)" was missed by 0:00:11.238632 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:46:21 CEST)" was missed by 0:00:11.438313 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:46:21 CEST)" was missed by 0:00:11.375081 
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:46:21 CEST)" was missed by 0:00:11.157769 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:46:21 CEST)" was missed by 0:00:11.114042 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:46:21 CEST)" was missed by 0:00:11.271198 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:46:21 CEST)" was missed by 0:00:11.340044 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:46:21 CEST)" was missed by 0:00:11.427325 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:46:21 CEST)" was missed by 0:00:11.149897 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:46:21 CEST)" was missed by 0:00:11.341426 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:46:21 CEST)" was missed by 0:00:11.237768 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:46:21 CEST)" was missed by 0:00:11.324450 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:46:21 CEST)" was missed by 0:00:11.366773 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:46:21 CEST)" was missed by 0:00:11.138119 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:46:21 CEST)" was missed by 0:00:11.269135 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:47:21 CEST)" was missed by 0:00:10.091916 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:47:21 CEST)" was missed by 0:00:10.112428 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:47:21 CEST)" was missed by 0:00:10.019325 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:47:21 CEST)" was missed by 0:00:10.149672 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:47:21 CEST)" was missed by 0:00:10.270410 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:47:21 CEST)" was missed by 0:00:10.030841 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 
2021-09-30 07:47:21 CEST)" was missed by 0:00:10.255179 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:47:21 CEST)" was missed by 0:00:10.313531 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:47:21 CEST)" was missed by 0:00:10.081182 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:47:21 CEST)" was missed by 0:00:10.267628 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:47:21 CEST)" was missed by 0:00:10.217613 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:47:21 CEST)" was missed by 0:00:10.160412 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:47:21 CEST)" was missed by 0:00:09.953142 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:47:21 CEST)" was missed by 0:00:10.000356 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:47:21 CEST)" was missed by 0:00:09.959621 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:47:21 CEST)" was missed by 0:00:10.242803 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:47:21 CEST)" was missed by 0:00:09.921138 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:47:21 CEST)" was missed by 0:00:09.999219 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:47:21 CEST)" was missed by 0:00:10.312696 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:47:21 CEST)" was missed by 0:00:10.120646 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:47:21 CEST)" was missed by 0:00:10.113772 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:47:21 CEST)" was missed by 0:00:10.248920 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:47:21 CEST)" was missed by 0:00:10.062982 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:47:21 CEST)" was missed by 0:00:10.182630 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:47:21 CEST)" was missed by 0:00:10.280891 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:47:21 CEST)" was missed by 0:00:10.219007 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:47:21 CEST)" was missed by 0:00:10.059429 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:47:21 CEST)" was missed by 0:00:10.268059 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:47:21 CEST)" was missed by 0:00:09.807168 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:47:21 CEST)" was missed by 0:00:09.992448 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:47:21 CEST)" was missed by 0:00:10.183976 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:47:21 CEST)" was missed by 0:00:09.967054 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:47:21 CEST)" was missed by 0:00:09.992303 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:47:21 CEST)" was missed by 0:00:09.956711 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:47:21 CEST)" was missed by 0:00:10.209345 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:47:21 CEST)" was missed by 0:00:10.080372 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:47:21 CEST)" was missed by 0:00:10.269978 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:47:21 CEST)" was missed by 0:00:09.980656 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:47:21 CEST)" was missed by 0:00:10.111688 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:47:21 CEST)" was missed by 0:00:10.167057 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:48:21 CEST)" was missed by 0:00:09.800920 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:48:21 CEST)" was missed by 0:00:09.728357 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:48:21 CEST)" was missed by 0:00:09.739869 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:48:21 CEST)" was missed by 0:00:09.708228 
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:48:21 CEST)" was missed by 0:00:09.821507 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:48:21 CEST)" was missed by 0:00:09.858698 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:48:21 CEST)" was missed by 0:00:09.790215 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:48:21 CEST)" was missed by 0:00:09.957916 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:48:21 CEST)" was missed by 0:00:09.976643 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:48:21 CEST)" was missed by 0:00:09.869433 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:48:21 CEST)" was missed by 0:00:09.927983 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:48:21 CEST)" was missed by 0:00:09.662153 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:48:21 CEST)" was missed by 0:00:09.668651 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:48:21 CEST)" was missed by 0:00:09.977075 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:48:21 CEST)" was missed by 0:00:09.979458 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:48:21 CEST)" was missed by 0:00:09.630186 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:48:21 CEST)" was missed by 0:00:10.021734 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:48:21 CEST)" was missed by 0:00:09.829648 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:48:21 CEST)" was missed by 0:00:09.964259 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:48:21 CEST)" was missed by 0:00:10.022562 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:48:21 CEST)" was missed by 0:00:09.822790 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:48:21 CEST)" was missed by 0:00:09.772012 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 
2021-09-30 07:48:21 CEST)" was missed by 0:00:09.891646 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:48:21 CEST)" was missed by 0:00:09.989898 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:48:21 CEST)" was missed by 0:00:09.926684 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:48:21 CEST)" was missed by 0:00:09.768449 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:48:21 CEST)" was missed by 0:00:09.951857 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:48:21 CEST)" was missed by 0:00:09.516166 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:48:21 CEST)" was missed by 0:00:09.701487 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:48:21 CEST)" was missed by 0:00:09.701305 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:48:21 CEST)" was missed by 0:00:09.893020 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:48:21 CEST)" was missed by 0:00:09.709415 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:48:21 CEST)" was missed by 0:00:09.676093 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:48:21 CEST)" was missed by 0:00:09.918352 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:48:21 CEST)" was missed by 0:00:09.789392 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:48:21 CEST)" was missed by 0:00:09.665734 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:48:21 CEST)" was missed by 0:00:09.979020 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:48:21 CEST)" was missed by 0:00:09.876067 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:48:21 CEST)" was missed by 0:00:09.689717 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:48:21 CEST)" was missed by 0:00:09.820738 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:49:21 CEST)" was missed by 0:00:10.676521 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:49:21 CEST)" was missed by 0:00:10.749131 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:49:21 CEST)" was missed by 0:00:10.687981 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:49:21 CEST)" was missed by 0:00:10.769664 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:49:21 CEST)" was missed by 0:00:10.738359 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:49:21 CEST)" was missed by 0:00:10.924774 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:49:21 CEST)" was missed by 0:00:10.817557 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:49:21 CEST)" was missed by 0:00:10.610290 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:49:21 CEST)" was missed by 0:00:10.616774 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:49:21 CEST)" was missed by 0:00:10.716548 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:49:21 CEST)" was missed by 0:00:10.925198 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:49:21 CEST)" was missed by 0:00:10.969896 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:49:21 CEST)" was missed by 0:00:10.912383 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:49:21 CEST)" was missed by 0:00:10.970716 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:49:21 CEST)" was missed by 0:00:10.770932 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:49:21 CEST)" was missed by 0:00:10.806898 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:49:21 CEST)" was missed by 0:00:10.906087 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:49:21 CEST)" was missed by 0:00:10.720111 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:49:21 CEST)" was missed by 0:00:10.839788 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:49:21 CEST)" was missed by 0:00:10.938040 
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:49:21 CEST)" was missed by 0:00:10.874840 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:49:21 CEST)" was missed by 0:00:10.876154 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:49:21 CEST)" was missed by 0:00:10.927633 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:49:21 CEST)" was missed by 0:00:10.464309 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:49:21 CEST)" was missed by 0:00:10.649621 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:49:21 CEST)" was missed by 0:00:10.624194 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:49:21 CEST)" was missed by 0:00:10.578341 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:49:21 CEST)" was missed by 0:00:10.656415 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:49:21 CEST)" was missed by 0:00:10.927118 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:49:21 CEST)" was missed by 0:00:10.777855 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:49:21 CEST)" was missed by 0:00:10.866515 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:49:21 CEST)" was missed by 0:00:10.657563 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:49:21 CEST)" was missed by 0:00:10.737534 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:49:21 CEST)" was missed by 0:00:10.900027 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:49:21 CEST)" was missed by 0:00:10.613843 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:49:21 CEST)" was missed by 0:00:10.649482 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:49:21 CEST)" was missed by 0:00:10.841200 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:49:21 CEST)" was missed by 0:00:10.637853 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 
2021-09-30 07:49:21 CEST)" was missed by 0:00:10.824238 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:49:21 CEST)" was missed by 0:00:10.768891 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:50:21 CEST)" was missed by 0:00:09.658733 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:50:21 CEST)" was missed by 0:00:09.731358 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:50:21 CEST)" was missed by 0:00:09.952926 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:50:21 CEST)" was missed by 0:00:09.751869 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:50:21 CEST)" was missed by 0:00:09.858377 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:50:21 CEST)" was missed by 0:00:09.599015 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:50:21 CEST)" was missed by 0:00:09.670240 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:50:21 CEST)" was missed by 0:00:09.952105 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:50:21 CEST)" was missed by 0:00:09.760067 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:50:21 CEST)" was missed by 0:00:09.894618 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:50:21 CEST)" was missed by 0:00:09.753127 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:50:21 CEST)" was missed by 0:00:09.789119 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:50:21 CEST)" was missed by 0:00:09.720593 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:50:21 CEST)" was missed by 0:00:09.888302 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:50:21 CEST)" was missed by 0:00:09.907031 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:50:21 CEST)" was missed by 0:00:09.857023 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:50:21 CEST)" was missed by 0:00:09.799845 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:50:21 CEST)" was missed by 0:00:09.592539 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:50:21 CEST)" was missed by 0:00:09.698828 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:50:21 CEST)" was missed by 0:00:09.907444 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:50:21 CEST)" was missed by 0:00:09.909856 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:50:21 CEST)" was missed by 0:00:09.446550 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:50:21 CEST)" was missed by 0:00:09.560559 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:50:21 CEST)" was missed by 0:00:09.631675 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:50:21 CEST)" was missed by 0:00:09.638651 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:50:21 CEST)" was missed by 0:00:09.823401 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:50:21 CEST)" was missed by 0:00:09.806417 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:50:21 CEST)" was missed by 0:00:09.702403 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:50:21 CEST)" was missed by 0:00:09.822042 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:50:21 CEST)" was missed by 0:00:09.848747 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:50:21 CEST)" was missed by 0:00:09.920298 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:50:21 CEST)" was missed by 0:00:09.639780 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:50:21 CEST)" was missed by 0:00:09.882244 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:50:21 CEST)" was missed by 0:00:09.631856 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:50:21 CEST)" was missed by 0:00:09.606439 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:50:21 CEST)" was missed by 0:00:09.909358 
- iteration 6690/ 159576 | consumed samples: 232000 | elapsed time per iteration (ms): 29807.2 | learning rate: 6.000E-05 | global batch size: 80 | lm loss: 6.385251E+00 | loss scale: 1024.0 | grad norm: 32704.999 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:51:21 CEST)" was missed by 0:00:09.229039
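The iteration lines land every 10 steps, so between consecutive entries the consumed-samples counter should grow by 10 times the global batch size. A quick sanity check of that bookkeeping, using only the iteration 6690 entry above and the 6700 entry further down:

    # Values copied from the iteration 6690 and 6700 lines of this log.
    it0, samples0 = 6690, 232_000
    it1, samples1 = 6700, 232_800
    global_batch_size = 80

    # 10 logged iterations * 80 samples each = 800 consumed samples.
    assert samples1 - samples0 == (it1 - it0) * global_batch_size
    print(samples1 - samples0)  # -> 800

Note that the cumulative counter (232,000) is far below 6690 * 80 = 535,200, consistent with the global batch size having ramped up from a smaller value earlier in the run.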
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:52:21 CEST)" was missed by 0:00:10.330174
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:53:21 CEST)" was missed by 0:00:10.903697
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:54:21 CEST)" was missed by 0:00:12.004803
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:55:21 CEST)" was missed by 0:00:12.209393
2021-09-30 07:55:21 CEST)" was missed by 0:00:12.229241 - iteration 6700/ 159576 | consumed samples: 232800 | elapsed time per iteration (ms): 30231.3 | learning rate: 6.000E-05 | global batch size: 80 | lm loss: 6.368582E+00 | loss scale: 1024.0 | grad norm: 36497.315 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:56:21 CEST)" was missed by 0:00:11.894730 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:56:21 CEST)" was missed by 0:00:11.967348 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:56:21 CEST)" was missed by 0:00:12.188924 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:56:21 CEST)" was missed by 0:00:12.094345 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:56:21 CEST)" was missed by 0:00:11.906213 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:56:21 CEST)" was missed by 0:00:11.987870 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:56:21 CEST)" was missed by 0:00:12.025083 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:56:21 CEST)" was missed by 0:00:11.956563 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:56:21 CEST)" was missed by 0:00:12.142969 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:56:21 CEST)" was missed by 0:00:12.035800 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:56:21 CEST)" was missed by 0:00:11.835025 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:56:21 CEST)" was missed by 0:00:12.145838 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:56:21 CEST)" was missed by 0:00:12.188078 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:56:21 CEST)" was missed by 0:00:11.996045 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:56:21 CEST)" was missed by 0:00:11.989155 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:56:21 CEST)" was missed by 0:00:12.124289 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 
07:56:21 CEST)" was missed by 0:00:11.938353 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:56:21 CEST)" was missed by 0:00:12.058019 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:56:21 CEST)" was missed by 0:00:12.156262 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:56:21 CEST)" was missed by 0:00:12.093036 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:56:21 CEST)" was missed by 0:00:11.828537 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:56:21 CEST)" was missed by 0:00:11.934825 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:56:21 CEST)" was missed by 0:00:12.118218 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:56:21 CEST)" was missed by 0:00:12.143471 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:56:21 CEST)" was missed by 0:00:11.682548 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:56:21 CEST)" was missed by 0:00:11.867823 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:56:21 CEST)" was missed by 0:00:11.796577 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:56:21 CEST)" was missed by 0:00:11.867671 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:56:21 CEST)" was missed by 0:00:11.874643 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:56:21 CEST)" was missed by 0:00:12.130637 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:56:21 CEST)" was missed by 0:00:12.059384 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:56:21 CEST)" was missed by 0:00:12.042405 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:56:21 CEST)" was missed by 0:00:12.084738 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:56:21 CEST)" was missed by 0:00:11.875798 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:56:21 CEST)" was missed by 0:00:11.955763 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:56:21 CEST)" was missed by 0:00:11.842429 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:56:21 CEST)" was missed by 0:00:12.145363 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:56:21 CEST)" was missed by 0:00:11.832103 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:56:21 CEST)" was missed by 0:00:11.856057 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:56:21 CEST)" was missed by 0:00:11.987081 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:57:21 CEST)" was missed by 0:00:12.678237 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:57:21 CEST)" was missed by 0:00:12.901673 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:57:21 CEST)" was missed by 0:00:12.689721 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:57:21 CEST)" was missed by 0:00:12.772650 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:57:21 CEST)" was missed by 0:00:12.808612 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:57:21 CEST)" was missed by 0:00:12.721836 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:57:21 CEST)" was missed by 0:00:12.612016 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:57:21 CEST)" was missed by 0:00:12.618522 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:57:21 CEST)" was missed by 0:00:12.718312 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:57:21 CEST)" was missed by 0:00:12.466007 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:57:21 CEST)" was missed by 0:00:12.651332 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:57:21 CEST)" was missed by 0:00:12.580028 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:57:21 CEST)" was missed by 0:00:12.651175 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:57:21 CEST)" was missed by 0:00:12.779585 
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:57:21 CEST)" was missed by 0:00:12.740139 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:57:21 CEST)" was missed by 0:00:12.825871 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:57:21 CEST)" was missed by 0:00:12.841514 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:57:21 CEST)" was missed by 0:00:12.868180 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:57:21 CEST)" was missed by 0:00:12.659258 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:57:21 CEST)" was missed by 0:00:12.615561 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:57:21 CEST)" was missed by 0:00:12.639494 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:57:21 CEST)" was missed by 0:00:12.658173 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:57:21 CEST)" was missed by 0:00:12.739266 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:57:21 CEST)" was missed by 0:00:12.625941 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:57:21 CEST)" was missed by 0:00:12.770524 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:58:21 CEST)" was missed by 0:00:11.115652 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:58:21 CEST)" was missed by 0:00:11.017422 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:58:21 CEST)" was missed by 0:00:11.127122 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:58:21 CEST)" was missed by 0:00:11.246018 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:58:21 CEST)" was missed by 0:00:11.159255 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:58:21 CEST)" was missed by 0:00:11.278903 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:58:21 CEST)" was missed by 0:00:11.049422 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 
2021-09-30 07:58:21 CEST)" was missed by 0:00:11.055907 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:58:21 CEST)" was missed by 0:00:11.155716 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:58:21 CEST)" was missed by 0:00:10.903438 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:58:21 CEST)" was missed by 0:00:11.210075 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:58:21 CEST)" was missed by 0:00:11.188350 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:58:21 CEST)" was missed by 0:00:11.096687 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:58:21 CEST)" was missed by 0:00:11.339071 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:58:21 CEST)" was missed by 0:00:11.088589 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:58:21 CEST)" was missed by 0:00:11.095574 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:58:21 CEST)" was missed by 0:00:11.177562 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:58:21 CEST)" was missed by 0:00:11.063359 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:58:21 CEST)" was missed by 0:00:11.217029 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:58:21 CEST)" was missed by 0:00:11.305633 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:58:21 CEST)" was missed by 0:00:11.176686 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:58:21 CEST)" was missed by 0:00:11.088828 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:58:21 CEST)" was missed by 0:00:11.208855 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:58:21 CEST)" was missed by 0:00:11.053003 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:58:21 CEST)" was missed by 0:00:11.315397 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:58:21 CEST)" was missed by 0:00:11.409081 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:58:21 CEST)" was missed by 0:00:11.351564 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:58:21 CEST)" was missed by 0:00:11.409957 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:58:21 CEST)" was missed by 0:00:11.345318 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:58:21 CEST)" was missed by 0:00:11.263376 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:58:21 CEST)" was missed by 0:00:11.364008 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:58:21 CEST)" was missed by 0:00:11.077014 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:58:21 CEST)" was missed by 0:00:11.377281 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:58:21 CEST)" was missed by 0:00:11.314084 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:58:21 CEST)" was missed by 0:00:11.364492 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:58:21 CEST)" was missed by 0:00:11.366918 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:58:21 CEST)" was missed by 0:00:11.208029 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:58:21 CEST)" was missed by 0:00:11.256853 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:58:21 CEST)" was missed by 0:00:11.280458 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:58:21 CEST)" was missed by 0:00:11.366398 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:59:21 CEST)" was missed by 0:00:09.532525 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:59:21 CEST)" was missed by 0:00:09.605152 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:59:21 CEST)" was missed by 0:00:09.730795 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:59:21 CEST)" was missed by 0:00:09.732155 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:59:21 CEST)" was missed by 0:00:09.466329 
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:59:21 CEST)" was missed by 0:00:09.544036 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:59:21 CEST)" was missed by 0:00:09.472855 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:59:21 CEST)" was missed by 0:00:09.781251 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:59:21 CEST)" was missed by 0:00:09.825910 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:59:21 CEST)" was missed by 0:00:09.768400 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:59:21 CEST)" was missed by 0:00:09.826764 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:59:21 CEST)" was missed by 0:00:09.626946 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:59:21 CEST)" was missed by 0:00:09.625696 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:59:21 CEST)" was missed by 0:00:09.662926 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:59:21 CEST)" was missed by 0:00:09.762095 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:59:21 CEST)" was missed by 0:00:09.780821 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:59:21 CEST)" was missed by 0:00:09.576162 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:59:21 CEST)" was missed by 0:00:09.673623 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:59:21 CEST)" was missed by 0:00:09.572632 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:59:21 CEST)" was missed by 0:00:09.755985 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:59:21 CEST)" was missed by 0:00:09.783689 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:59:21 CEST)" was missed by 0:00:09.320355 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:59:21 CEST)" was missed by 0:00:09.434381 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 
2021-09-30 07:59:21 CEST)" was missed by 0:00:09.505477 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:59:21 CEST)" was missed by 0:00:09.512461 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:59:21 CEST)" was missed by 0:00:09.594429 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:59:21 CEST)" was missed by 0:00:09.697184 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:59:21 CEST)" was missed by 0:00:09.680185 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:59:21 CEST)" was missed by 0:00:09.695838 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:59:21 CEST)" was missed by 0:00:09.794094 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:59:21 CEST)" was missed by 0:00:09.513569 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:59:21 CEST)" was missed by 0:00:09.505661 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:59:21 CEST)" was missed by 0:00:09.480229 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:59:21 CEST)" was missed by 0:00:09.469889 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:59:21 CEST)" was missed by 0:00:09.783177 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:59:21 CEST)" was missed by 0:00:09.633927 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:59:21 CEST)" was missed by 0:00:09.722556 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:59:21 CEST)" was missed by 0:00:09.593608 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:59:21 CEST)" was missed by 0:00:09.493836 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 07:59:21 CEST)" was missed by 0:00:09.624853 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:00:21 CEST)" was missed by 0:00:08.942377 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:00:21 CEST)" was missed by 0:00:08.869814 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:00:21 CEST)" was missed by 0:00:08.881281 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:00:21 CEST)" was missed by 0:00:08.962909 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:00:21 CEST)" was missed by 0:00:09.118036 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:00:21 CEST)" was missed by 0:00:09.069407 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:00:21 CEST)" was missed by 0:00:09.163141 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:00:21 CEST)" was missed by 0:00:09.000163 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:00:21 CEST)" was missed by 0:00:08.810088 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:00:21 CEST)" was missed by 0:00:09.118503 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:00:21 CEST)" was missed by 0:00:08.771602 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:00:21 CEST)" was missed by 0:00:09.105645 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:00:21 CEST)" was missed by 0:00:09.164017 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:00:21 CEST)" was missed by 0:00:09.068105 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:00:21 CEST)" was missed by 0:00:08.803575 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:00:21 CEST)" was missed by 0:00:09.093248 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:00:21 CEST)" was missed by 0:00:09.120919 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:00:21 CEST)" was missed by 0:00:08.964218 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:00:21 CEST)" was missed by 0:00:09.099365 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:00:21 CEST)" was missed by 0:00:08.913434 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:00:21 CEST)" was missed by 0:00:09.033080 
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:00:21 CEST)" was missed by 0:00:09.131338 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:00:21 CEST)" was missed by 0:00:09.010909 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:00:21 CEST)" was missed by 0:00:08.850848 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:00:21 CEST)" was missed by 0:00:08.909911 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:00:21 CEST)" was missed by 0:00:08.657598 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:00:21 CEST)" was missed by 0:00:08.842929 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:00:21 CEST)" was missed by 0:00:08.817498 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:00:21 CEST)" was missed by 0:00:08.842760 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:00:21 CEST)" was missed by 0:00:08.849726 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:00:21 CEST)" was missed by 0:00:08.971147 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:00:21 CEST)" was missed by 0:00:08.931727 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:00:21 CEST)" was missed by 0:00:09.017469 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:00:21 CEST)" was missed by 0:00:09.034470 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:00:21 CEST)" was missed by 0:00:09.059808 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:00:21 CEST)" was missed by 0:00:08.930865 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:00:21 CEST)" was missed by 0:00:09.120433 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:00:21 CEST)" was missed by 0:00:08.807174 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:00:21 CEST)" was missed by 0:00:08.831154 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 
2021-09-30 08:00:21 CEST)" was missed by 0:00:08.962147 - iteration 6710/ 159576 | consumed samples: 233600 | elapsed time per iteration (ms): 29774.3 | learning rate: 6.000E-05 | global batch size: 80 | lm loss: 6.359943E+00 | loss scale: 1024.0 | grad norm: 39467.594 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:01:21 CEST)" was missed by 0:00:10.017020 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:01:21 CEST)" was missed by 0:00:09.944447 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:01:21 CEST)" was missed by 0:00:10.192651 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:01:21 CEST)" was missed by 0:00:09.878149 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:01:21 CEST)" was missed by 0:00:10.193100 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:01:21 CEST)" was missed by 0:00:09.955879 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:01:21 CEST)" was missed by 0:00:10.037555 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:01:21 CEST)" was missed by 0:00:10.173930 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:01:21 CEST)" was missed by 0:00:09.987980 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:01:21 CEST)" was missed by 0:00:10.085463 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:01:21 CEST)" was missed by 0:00:10.144021 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:01:21 CEST)" was missed by 0:00:09.984456 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:01:21 CEST)" was missed by 0:00:10.074784 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:01:21 CEST)" was missed by 0:00:09.925403 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:01:21 CEST)" was missed by 0:00:10.195528 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:01:21 CEST)" was missed by 0:00:09.732206 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 
08:01:21 CEST)" was missed by 0:00:09.917326 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:01:21 CEST)" was missed by 0:00:10.237780 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:01:21 CEST)" was missed by 0:00:10.180300 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:01:21 CEST)" was missed by 0:00:10.238651 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:01:21 CEST)" was missed by 0:00:10.038869 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:01:21 CEST)" was missed by 0:00:09.884769 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:01:21 CEST)" was missed by 0:00:10.167876 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:01:21 CEST)" was missed by 0:00:09.892094 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:01:21 CEST)" was missed by 0:00:09.846269 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:01:21 CEST)" was missed by 0:00:09.924336 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:01:21 CEST)" was missed by 0:00:10.006351 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:01:21 CEST)" was missed by 0:00:10.107750 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:01:21 CEST)" was missed by 0:00:10.206007 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:01:21 CEST)" was missed by 0:00:10.142806 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:01:21 CEST)" was missed by 0:00:09.881732 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:01:21 CEST)" was missed by 0:00:10.195027 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:01:21 CEST)" was missed by 0:00:09.917582 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:01:21 CEST)" was missed by 0:00:10.045808 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:01:21 CEST)" was missed by 0:00:10.134442 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:01:21 CEST)" was missed by 0:00:10.005491 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:01:21 CEST)" was missed by 0:00:10.109195 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:01:21 CEST)" was missed by 0:00:10.092210 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:01:21 CEST)" was missed by 0:00:09.905900 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:01:21 CEST)" was missed by 0:00:10.036886 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:02:21 CEST)" was missed by 0:00:10.342814 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:02:21 CEST)" was missed by 0:00:10.270249 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:02:21 CEST)" was missed by 0:00:10.518491 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:02:21 CEST)" was missed by 0:00:10.210513 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:02:21 CEST)" was missed by 0:00:10.281748 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:02:21 CEST)" was missed by 0:00:10.400607 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:02:21 CEST)" was missed by 0:00:10.469861 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:02:21 CEST)" was missed by 0:00:10.204018 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:02:21 CEST)" was missed by 0:00:10.518960 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:02:21 CEST)" was missed by 0:00:10.563595 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:02:21 CEST)" was missed by 0:00:10.363401 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:02:21 CEST)" was missed by 0:00:10.499794 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:02:21 CEST)" was missed by 0:00:10.468548 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:02:21 CEST)" was missed by 0:00:10.521366 
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:02:21 CEST)" was missed by 0:00:10.506129 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:02:21 CEST)" was missed by 0:00:10.564457 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:02:21 CEST)" was missed by 0:00:10.364669 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:02:21 CEST)" was missed by 0:00:10.313873 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:02:21 CEST)" was missed by 0:00:10.433512 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:02:21 CEST)" was missed by 0:00:10.531784 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:02:21 CEST)" was missed by 0:00:10.411346 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:02:21 CEST)" was missed by 0:00:10.251259 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:02:21 CEST)" was missed by 0:00:10.310333 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:02:21 CEST)" was missed by 0:00:10.493741 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:02:21 CEST)" was missed by 0:00:10.058045 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:02:21 CEST)" was missed by 0:00:10.172103 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:02:21 CEST)" was missed by 0:00:10.243200 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:02:21 CEST)" was missed by 0:00:10.250184 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:02:21 CEST)" was missed by 0:00:10.332155 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:02:21 CEST)" was missed by 0:00:10.434902 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:02:21 CEST)" was missed by 0:00:10.460231 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:02:21 CEST)" was missed by 0:00:10.331271 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 
2021-09-30 08:02:21 CEST)" was missed by 0:00:10.243387 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:02:21 CEST)" was missed by 0:00:10.217940 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:02:21 CEST)" was missed by 0:00:10.207597 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:02:21 CEST)" was missed by 0:00:10.371647 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:02:21 CEST)" was missed by 0:00:10.520879 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:02:21 CEST)" was missed by 0:00:10.417948 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:02:21 CEST)" was missed by 0:00:10.231598 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:02:21 CEST)" was missed by 0:00:10.362589 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:03:21 CEST)" was missed by 0:00:09.051373 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:03:21 CEST)" was missed by 0:00:09.123956 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:03:21 CEST)" was missed by 0:00:09.344683 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:03:21 CEST)" was missed by 0:00:09.287182 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:03:21 CEST)" was missed by 0:00:09.144453 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:03:21 CEST)" was missed by 0:00:09.299603 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:03:21 CEST)" was missed by 0:00:09.250944 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:03:21 CEST)" was missed by 0:00:08.985093 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:03:21 CEST)" was missed by 0:00:09.274794 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:03:21 CEST)" was missed by 0:00:09.300026 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:03:21 CEST)" was missed by 0:00:08.839123 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:03:21 CEST)" was missed by 0:00:08.953137 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:03:21 CEST)" was missed by 0:00:09.062848 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:03:21 CEST)" was missed by 0:00:09.345580 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:03:21 CEST)" was missed by 0:00:09.181735 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:03:21 CEST)" was missed by 0:00:09.280900 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:03:21 CEST)" was missed by 0:00:09.094968 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:03:21 CEST)" was missed by 0:00:09.249653 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:03:21 CEST)" was missed by 0:00:09.192438 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:03:21 CEST)" was missed by 0:00:09.032368 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:03:21 CEST)" was missed by 0:00:08.991689 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:03:21 CEST)" was missed by 0:00:09.091406 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:03:21 CEST)" was missed by 0:00:09.302480 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:03:21 CEST)" was missed by 0:00:08.998993 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:03:21 CEST)" was missed by 0:00:09.024297 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:03:21 CEST)" was missed by 0:00:09.301928 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:03:21 CEST)" was missed by 0:00:09.145792 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:03:21 CEST)" was missed by 0:00:09.113285 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:03:21 CEST)" was missed by 0:00:09.312932 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:03:21 CEST)" was missed by 0:00:09.024501 
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:03:21 CEST)" was missed by 0:00:08.988656 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:03:21 CEST)" was missed by 0:00:09.031305 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:03:21 CEST)" was missed by 0:00:09.216018 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:03:21 CEST)" was missed by 0:00:09.214692 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:03:21 CEST)" was missed by 0:00:09.199043 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:03:21 CEST)" was missed by 0:00:09.152771 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:03:21 CEST)" was missed by 0:00:09.241412 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:03:21 CEST)" was missed by 0:00:09.112443 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:03:21 CEST)" was missed by 0:00:09.012716 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:03:21 CEST)" was missed by 0:00:09.143713 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:04:21 CEST)" was missed by 0:00:08.831909 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:04:21 CEST)" was missed by 0:00:08.904539 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:04:21 CEST)" was missed by 0:00:09.031541 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:04:21 CEST)" was missed by 0:00:08.765699 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:04:21 CEST)" was missed by 0:00:08.772216 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:04:21 CEST)" was missed by 0:00:09.126116 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:04:21 CEST)" was missed by 0:00:08.962291 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:04:21 CEST)" was missed by 0:00:09.080204 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 
2021-09-30 08:04:21 CEST)" was missed by 0:00:09.030209 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:04:21 CEST)" was missed by 0:00:08.843455 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:04:21 CEST)" was missed by 0:00:08.926341 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:04:21 CEST)" was missed by 0:00:08.925090 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:04:21 CEST)" was missed by 0:00:09.080647 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:04:21 CEST)" was missed by 0:00:09.083064 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:04:21 CEST)" was missed by 0:00:08.619739 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:04:21 CEST)" was missed by 0:00:08.811859 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:04:21 CEST)" was missed by 0:00:09.125301 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:04:21 CEST)" was missed by 0:00:09.067815 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:04:21 CEST)" was missed by 0:00:08.893815 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:04:21 CEST)" was missed by 0:00:08.996563 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:04:21 CEST)" was missed by 0:00:09.061504 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:04:21 CEST)" was missed by 0:00:08.875595 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:04:21 CEST)" was missed by 0:00:08.995227 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:04:21 CEST)" was missed by 0:00:09.093474 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:04:21 CEST)" was missed by 0:00:08.973050 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:04:21 CEST)" was missed by 0:00:08.812988 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:04:21 CEST)" was missed by 0:00:08.872037 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:04:21 CEST)" was missed by 0:00:09.055435 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:04:21 CEST)" was missed by 0:00:08.805026 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:04:21 CEST)" was missed by 0:00:08.779634 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:04:21 CEST)" was missed by 0:00:08.733788 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:04:21 CEST)" was missed by 0:00:08.804910 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:04:21 CEST)" was missed by 0:00:08.979581 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:04:21 CEST)" was missed by 0:00:09.021935 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:04:21 CEST)" was missed by 0:00:08.892965 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:04:21 CEST)" was missed by 0:00:08.769286 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:04:21 CEST)" was missed by 0:00:09.082561 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:04:21 CEST)" was missed by 0:00:08.933351 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:04:21 CEST)" was missed by 0:00:08.924238 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:04:21 CEST)" was missed by 0:00:08.793253 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:05:21 CEST)" was missed by 0:00:09.256390 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:05:21 CEST)" was missed by 0:00:09.183823 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:05:21 CEST)" was missed by 0:00:09.383393 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:05:21 CEST)" was missed by 0:00:09.117554 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:05:21 CEST)" was missed by 0:00:09.124058 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:05:21 CEST)" was missed by 0:00:09.478000 
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:05:21 CEST)" was missed by 0:00:09.276933
- iteration 6720/ 159576 | consumed samples: 234400 | elapsed time per iteration (ms): 29941.4 | learning rate: 6.000E-05 | global batch size: 80 | lm loss: 6.368979E+00 | loss scale: 1024.0 | grad norm: 43688.302 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:06:21 CEST)" was missed by 0:00:08.766853
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:07:21 CEST)" was missed by 0:00:08.122635
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:08:21 CEST)" was missed by 0:00:07.364303
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:09:21 CEST)" was missed by 0:00:06.874956
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:10:21 CEST)" was missed by 0:00:05.532991
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:10:21 CEST)" was missed by 0:00:05.440332 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:10:21 CEST)" was missed by 0:00:05.711005 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:10:21 CEST)" was missed by 0:00:05.608036 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:10:21 CEST)" was missed by 0:00:05.623725 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:10:21 CEST)" was missed by 0:00:05.500512 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:10:21 CEST)" was missed by 0:00:05.408106 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:10:21 CEST)" was missed by 0:00:05.397757 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:10:21 CEST)" was missed by 0:00:05.421689 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:10:21 CEST)" was missed by 0:00:05.561801 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:10:21 CEST)" was missed by 0:00:05.552687 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:10:21 CEST)" was missed by 0:00:05.650412 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:10:21 CEST)" was missed by 0:00:05.521458 - iteration 6730/ 159576 | consumed samples: 235200 | elapsed time per iteration (ms): 29711.1 | learning rate: 6.000E-05 | global batch size: 80 | lm loss: 6.346693E+00 | loss scale: 1024.0 | grad norm: 42854.132 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:11:21 CEST)" was missed by 0:00:06.097493 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:11:21 CEST)" was missed by 0:00:06.118028 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:11:21 CEST)" was missed by 0:00:06.024911 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:11:21 CEST)" was missed by 0:00:06.223165 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:11:21 CEST)" was missed by 0:00:05.958672 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:11:21 CEST)" was missed by 0:00:06.273584 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:11:21 CEST)" was missed by 0:00:06.036392 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:11:21 CEST)" was missed by 0:00:06.254435 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:11:21 CEST)" was missed by 0:00:06.286408 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:11:21 CEST)" was missed by 0:00:06.224509 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:11:21 CEST)" was missed by 0:00:05.965183 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:11:21 CEST)" was missed by 0:00:05.812682 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:11:21 CEST)" was missed by 0:00:06.318267 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:11:21 CEST)" was missed by 0:00:06.260752 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:11:21 CEST)" was missed by 0:00:06.319133 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:11:21 CEST)" was missed by 0:00:06.155277 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:11:21 CEST)" was missed by 0:00:06.273182 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:11:21 CEST)" was missed by 0:00:06.068538 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:11:21 CEST)" was missed by 0:00:06.166004 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:11:21 CEST)" was missed by 0:00:06.005919 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:11:21 CEST)" was missed by 0:00:06.064991 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:11:21 CEST)" was missed by 0:00:06.248367 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:11:21 CEST)" was missed by 0:00:06.276031 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:11:21 CEST)" was missed by 0:00:05.972556 
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:11:21 CEST)" was missed by 0:00:05.926731 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:11:21 CEST)" was missed by 0:00:05.997847 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:11:21 CEST)" was missed by 0:00:06.004812 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:11:21 CEST)" was missed by 0:00:06.119357 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:11:21 CEST)" was missed by 0:00:06.086848 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:11:21 CEST)" was missed by 0:00:06.189519 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:11:21 CEST)" was missed by 0:00:06.172559 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:11:21 CEST)" was missed by 0:00:06.188201 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:11:21 CEST)" was missed by 0:00:05.998026 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:11:21 CEST)" was missed by 0:00:05.962238 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:11:21 CEST)" was missed by 0:00:06.275500 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:11:21 CEST)" was missed by 0:00:06.214914 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:11:21 CEST)" was missed by 0:00:06.085934 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:11:21 CEST)" was missed by 0:00:05.986203 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:11:21 CEST)" was missed by 0:00:06.126319 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:11:21 CEST)" was missed by 0:00:06.117224 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:12:21 CEST)" was missed by 0:00:07.450677 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:12:21 CEST)" was missed by 0:00:07.378088 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 
2021-09-30 08:12:21 CEST)" was missed by 0:00:07.672283 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:12:21 CEST)" was missed by 0:00:07.626771 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:12:21 CEST)" was missed by 0:00:07.639574 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:12:21 CEST)" was missed by 0:00:07.576379 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:12:21 CEST)" was missed by 0:00:07.311882 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:12:21 CEST)" was missed by 0:00:07.389606 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:12:21 CEST)" was missed by 0:00:07.613952 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:12:21 CEST)" was missed by 0:00:07.472488 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:12:21 CEST)" was missed by 0:00:07.471238 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:12:21 CEST)" was missed by 0:00:07.508465 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:12:21 CEST)" was missed by 0:00:07.626389 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:12:21 CEST)" was missed by 0:00:07.577719 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:12:21 CEST)" was missed by 0:00:07.318382 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:12:21 CEST)" was missed by 0:00:07.629225 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:12:21 CEST)" was missed by 0:00:07.165882 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:12:21 CEST)" was missed by 0:00:07.358009 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:12:21 CEST)" was missed by 0:00:07.671457 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:12:21 CEST)" was missed by 0:00:07.439984 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:12:21 CEST)" was missed by 0:00:07.542721 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:12:21 CEST)" was missed by 0:00:07.607668 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:12:21 CEST)" was missed by 0:00:07.421747 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:12:21 CEST)" was missed by 0:00:07.541398 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:12:21 CEST)" was missed by 0:00:07.519205 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:12:21 CEST)" was missed by 0:00:07.359137 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:12:21 CEST)" was missed by 0:00:07.418191 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:12:21 CEST)" was missed by 0:00:07.601598 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:12:21 CEST)" was missed by 0:00:07.351200 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:12:21 CEST)" was missed by 0:00:07.325759 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:12:21 CEST)" was missed by 0:00:07.279939 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:12:21 CEST)" was missed by 0:00:07.315419 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:12:21 CEST)" was missed by 0:00:07.351065 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:12:21 CEST)" was missed by 0:00:07.628683 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:12:21 CEST)" was missed by 0:00:07.479493 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:12:21 CEST)" was missed by 0:00:07.525749 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:12:21 CEST)" was missed by 0:00:07.568107 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:12:21 CEST)" was missed by 0:00:07.439139 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:12:21 CEST)" was missed by 0:00:07.470407 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:12:21 CEST)" was missed by 0:00:07.339428 
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:13:21 CEST)" was missed by 0:00:08.574856 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:13:21 CEST)" was missed by 0:00:08.502280 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:13:21 CEST)" was missed by 0:00:08.796468 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:13:21 CEST)" was missed by 0:00:08.700550 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:13:21 CEST)" was missed by 0:00:08.701884 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:13:21 CEST)" was missed by 0:00:08.436052 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:13:21 CEST)" was missed by 0:00:08.753360 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:13:21 CEST)" was missed by 0:00:08.596665 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:13:21 CEST)" was missed by 0:00:08.595426 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:13:21 CEST)" was missed by 0:00:08.632622 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:13:21 CEST)" was missed by 0:00:08.442577 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:13:21 CEST)" was missed by 0:00:08.513812 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:13:21 CEST)" was missed by 0:00:08.482150 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:13:21 CEST)" was missed by 0:00:08.751013 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:13:21 CEST)" was missed by 0:00:08.290077 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:13:21 CEST)" was missed by 0:00:08.475210 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:13:21 CEST)" was missed by 0:00:08.795652 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:13:21 CEST)" was missed by 0:00:08.738158 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 
2021-09-30 08:13:21 CEST)" was missed by 0:00:08.564172 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:13:21 CEST)" was missed by 0:00:08.666896 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:13:21 CEST)" was missed by 0:00:08.731863 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:13:21 CEST)" was missed by 0:00:08.750591 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:13:21 CEST)" was missed by 0:00:08.545940 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:13:21 CEST)" was missed by 0:00:08.665575 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:13:21 CEST)" was missed by 0:00:08.763855 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:13:21 CEST)" was missed by 0:00:08.643406 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:13:21 CEST)" was missed by 0:00:08.483311 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:13:21 CEST)" was missed by 0:00:08.542390 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:13:21 CEST)" was missed by 0:00:08.475376 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:13:21 CEST)" was missed by 0:00:08.449977 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:13:21 CEST)" was missed by 0:00:08.404133 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:13:21 CEST)" was missed by 0:00:08.603648 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:13:21 CEST)" was missed by 0:00:08.649927 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:13:21 CEST)" was missed by 0:00:08.725833 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:13:21 CEST)" was missed by 0:00:08.439632 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:13:21 CEST)" was missed by 0:00:08.463574 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:13:21 CEST)" was missed by 0:00:08.752910 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:13:21 CEST)" was missed by 0:00:08.692301 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:13:21 CEST)" was missed by 0:00:08.563328 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:13:21 CEST)" was missed by 0:00:08.594606 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:14:21 CEST)" was missed by 0:00:07.769942 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:14:21 CEST)" was missed by 0:00:07.697407 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:14:21 CEST)" was missed by 0:00:07.896976 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:14:21 CEST)" was missed by 0:00:07.946044 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:14:21 CEST)" was missed by 0:00:07.631151 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:14:21 CEST)" was missed by 0:00:07.637657 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:14:21 CEST)" was missed by 0:00:07.708901 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:14:21 CEST)" was missed by 0:00:07.790528 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:14:21 CEST)" was missed by 0:00:07.827740 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:14:21 CEST)" was missed by 0:00:07.895675 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:14:21 CEST)" was missed by 0:00:07.948474 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:14:21 CEST)" was missed by 0:00:07.485150 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:14:21 CEST)" was missed by 0:00:07.990708 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:14:21 CEST)" was missed by 0:00:07.933235 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:14:21 CEST)" was missed by 0:00:07.991591 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:14:21 CEST)" was missed by 0:00:07.945673 
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:14:21 CEST)" was missed by 0:00:07.838475 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:14:21 CEST)" was missed by 0:00:07.737467 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:14:21 CEST)" was missed by 0:00:07.599220 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:14:21 CEST)" was missed by 0:00:07.670312 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:14:21 CEST)" was missed by 0:00:07.677289 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:14:21 CEST)" was missed by 0:00:07.947954 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:14:21 CEST)" was missed by 0:00:07.791817 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:14:21 CEST)" was missed by 0:00:07.759307 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:14:21 CEST)" was missed by 0:00:07.862026 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:14:21 CEST)" was missed by 0:00:07.926950 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:14:21 CEST)" was missed by 0:00:07.741040 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:14:21 CEST)" was missed by 0:00:07.860683 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:14:21 CEST)" was missed by 0:00:07.958933 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:14:21 CEST)" was missed by 0:00:07.678421 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:14:21 CEST)" was missed by 0:00:07.920885 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:14:21 CEST)" was missed by 0:00:07.670500 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:14:21 CEST)" was missed by 0:00:07.645060 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:14:21 CEST)" was missed by 0:00:07.634694 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 
2021-09-30 08:14:21 CEST)" was missed by 0:00:07.798777 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:14:21 CEST)" was missed by 0:00:07.845064 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:14:21 CEST)" was missed by 0:00:07.887385 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:14:21 CEST)" was missed by 0:00:07.758425 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:14:21 CEST)" was missed by 0:00:07.789704 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:14:21 CEST)" was missed by 0:00:07.658727 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:15:21 CEST)" was missed by 0:00:08.910448 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:15:21 CEST)" was missed by 0:00:08.837843 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:15:21 CEST)" was missed by 0:00:08.771634 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:15:21 CEST)" was missed by 0:00:09.132056 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:15:21 CEST)" was missed by 0:00:08.968206 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:15:21 CEST)" was missed by 0:00:09.037468 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:15:21 CEST)" was missed by 0:00:09.086555 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:15:21 CEST)" was missed by 0:00:08.849375 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:15:21 CEST)" was missed by 0:00:08.817748 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:15:21 CEST)" was missed by 0:00:08.931008 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:15:21 CEST)" was missed by 0:00:09.086149 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:15:21 CEST)" was missed by 0:00:08.778148 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:15:21 CEST)" was missed by 0:00:09.088970 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:15:21 CEST)" was missed by 0:00:09.131235 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:15:21 CEST)" was missed by 0:00:09.073730 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:15:21 CEST)" was missed by 0:00:08.899746 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:15:21 CEST)" was missed by 0:00:09.067430 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:15:21 CEST)" was missed by 0:00:08.881501 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:15:21 CEST)" was missed by 0:00:09.001140 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:15:21 CEST)" was missed by 0:00:09.099375 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:15:21 CEST)" was missed by 0:00:09.036187 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:15:21 CEST)" was missed by 0:00:08.978959 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:15:21 CEST)" was missed by 0:00:08.818909 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:15:21 CEST)" was missed by 0:00:08.877962 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:15:21 CEST)" was missed by 0:00:08.625672 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:15:21 CEST)" was missed by 0:00:08.810954 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:15:21 CEST)" was missed by 0:00:08.739691 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:15:21 CEST)" was missed by 0:00:08.810809 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:15:21 CEST)" was missed by 0:00:09.088447 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:15:21 CEST)" was missed by 0:00:08.939239 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:15:21 CEST)" was missed by 0:00:08.932294 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:15:21 CEST)" was missed by 0:00:09.027842 
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:15:21 CEST)" was missed by 0:00:09.061397 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:15:21 CEST)" was missed by 0:00:08.785551 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:15:21 CEST)" was missed by 0:00:08.775193 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:15:21 CEST)" was missed by 0:00:09.002534 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:15:21 CEST)" was missed by 0:00:08.898891 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:15:21 CEST)" was missed by 0:00:08.985561 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:15:21 CEST)" was missed by 0:00:08.799219 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:15:21 CEST)" was missed by 0:00:08.930210 - iteration 6740/ 159576 | consumed samples: 236000 | elapsed time per iteration (ms): 30348.0 | learning rate: 6.000E-05 | global batch size: 80 | lm loss: 6.353148E+00 | loss scale: 1024.0 | grad norm: 36346.887 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:16:21 CEST)" was missed by 0:00:09.409056 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:16:21 CEST)" was missed by 0:00:09.336491 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:16:21 CEST)" was missed by 0:00:09.587549 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:16:21 CEST)" was missed by 0:00:09.630687 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:16:21 CEST)" was missed by 0:00:09.536094 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:16:21 CEST)" was missed by 0:00:09.348011 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:16:21 CEST)" was missed by 0:00:09.316360 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:16:21 CEST)" was missed by 0:00:09.429639 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:16:21 CEST)" was missed by 0:00:09.466839 
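The warning flood above comes from CodeCarbon's emissions tracker: BaseEmissionsTracker schedules its _measure_power job on a 60-second APScheduler interval, and apscheduler.executors.default logs "was missed by ..." whenever the job reaches the executor later than its misfire grace time allows. During long training steps the scheduler thread is starved of CPU and GIL time, so every process misses the same minute mark, and the lag creeps up minute over minute (about 7.8 s at 08:14 versus 13.2 s at 08:18). Below is a minimal sketch of the same failure mode, assuming APScheduler 3.x; measure_power, the 2-second interval, and the sys.setswitchinterval trick are illustrative stand-ins, not CodeCarbon's actual code:

    import logging
    import sys
    import time

    from apscheduler.schedulers.background import BackgroundScheduler

    logging.basicConfig(level=logging.WARNING)  # surface apscheduler's own warnings

    def measure_power():
        pass  # stand-in for BaseEmissionsTracker._measure_power

    scheduler = BackgroundScheduler()
    # Fire every 2 s; the executor reports a miss if a run starts more than 1 s late.
    scheduler.add_job(measure_power, "interval", seconds=2, misfire_grace_time=1)
    scheduler.start()

    sys.setswitchinterval(10)        # let the busy main thread hold the GIL for seconds
    busy_until = time.time() + 8
    while time.time() < busy_until:  # CPU-bound loop standing in for a training step
        pass

    time.sleep(2)                    # give the starved scheduler thread time to catch up
    scheduler.shutdown()
    # Expected output, analogous to the log above:
    # WARNING:apscheduler.executors.default:Run time of job "measure_power
    #   (trigger: interval[0:00:02], ...)" was missed by 0:00:0X.XXXXXX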
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:16:21 CEST)" was missed by 0:00:09.409056
-[... preceding WARNING repeated many times with nearly identical offsets ...]
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:17:21 CEST)" was missed by 0:00:10.694309
-[... preceding WARNING repeated many times with nearly identical offsets ...]
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:18:21 CEST)" was missed by 0:00:13.196646
-[... preceding WARNING repeated many times with nearly identical offsets ...]
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:18:21 CEST)" was missed by 0:00:13.185981 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:18:21 CEST)" was missed by 0:00:13.353654 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:18:21 CEST)" was missed by 0:00:13.167714 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:18:21 CEST)" was missed by 0:00:13.287330 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:18:21 CEST)" was missed by 0:00:13.265159 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:18:21 CEST)" was missed by 0:00:13.347605 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:18:21 CEST)" was missed by 0:00:13.097023 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:18:21 CEST)" was missed by 0:00:13.374650 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:18:21 CEST)" was missed by 0:00:13.225442 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:18:21 CEST)" was missed by 0:00:13.288709 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:18:21 CEST)" was missed by 0:00:13.314059 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:18:21 CEST)" was missed by 0:00:13.105138 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:18:21 CEST)" was missed by 0:00:13.071789 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:18:21 CEST)" was missed by 0:00:13.061409 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:18:21 CEST)" was missed by 0:00:13.271762 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:18:21 CEST)" was missed by 0:00:13.185092 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:18:21 CEST)" was missed by 0:00:13.085401 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:18:21 CEST)" was missed by 0:00:13.216399 - iteration 6750/ 159576 | consumed samples: 236912 | elapsed time per iteration (ms): 31367.5 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.339949E+00 | loss scale: 
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:25:21 CEST)" was missed by 0:00:05.748677
-[... preceding WARNING repeated many times with nearly identical offsets ...]
2021-09-30 08:25:21 CEST)" was missed by 0:00:05.747875 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:25:21 CEST)" was missed by 0:00:05.803303 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:25:21 CEST)" was missed by 0:00:05.592925 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:25:21 CEST)" was missed by 0:00:05.616987 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:26:21 CEST)" was missed by 0:00:09.471876 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:26:21 CEST)" was missed by 0:00:09.414382 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:26:21 CEST)" was missed by 0:00:09.472768 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:26:21 CEST)" was missed by 0:00:09.271673 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:26:21 CEST)" was missed by 0:00:09.308914 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:26:21 CEST)" was missed by 0:00:09.251168 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:26:21 CEST)" was missed by 0:00:09.426786 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:26:21 CEST)" was missed by 0:00:09.222167 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:26:21 CEST)" was missed by 0:00:09.376796 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:26:21 CEST)" was missed by 0:00:09.319620 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:26:21 CEST)" was missed by 0:00:09.378167 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:26:21 CEST)" was missed by 0:00:09.112318 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:26:21 CEST)" was missed by 0:00:09.402025 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:26:21 CEST)" was missed by 0:00:09.427252 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:26:21 CEST)" was missed by 0:00:08.966342 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:26:21 CEST)" was missed by 0:00:09.151629 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:26:21 CEST)" was missed by 0:00:09.080341 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:26:21 CEST)" was missed by 0:00:09.190062 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:26:21 CEST)" was missed by 0:00:09.158449 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:26:21 CEST)" was missed by 0:00:09.272993 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:26:21 CEST)" was missed by 0:00:09.178604 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:26:21 CEST)" was missed by 0:00:09.240477 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:26:21 CEST)" was missed by 0:00:09.343156 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:26:21 CEST)" was missed by 0:00:09.408139 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:26:21 CEST)" was missed by 0:00:09.270797 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:26:21 CEST)" was missed by 0:00:09.326186 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:26:21 CEST)" was missed by 0:00:09.341851 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:26:21 CEST)" was missed by 0:00:09.440135 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:26:21 CEST)" was missed by 0:00:09.159617 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:26:21 CEST)" was missed by 0:00:09.118895 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:26:21 CEST)" was missed by 0:00:09.218625 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:26:21 CEST)" was missed by 0:00:09.429678 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:26:21 CEST)" was missed by 0:00:09.126227 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:26:21 CEST)" was missed by 0:00:09.151460 
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:26:21 CEST)" was missed by 0:00:09.429174 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:26:21 CEST)" was missed by 0:00:09.279948 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:26:21 CEST)" was missed by 0:00:09.368572 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:26:21 CEST)" was missed by 0:00:09.239600 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:26:21 CEST)" was missed by 0:00:09.115927 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:26:21 CEST)" was missed by 0:00:09.139879 - iteration 6760/ 159576 | consumed samples: 237872 | elapsed time per iteration (ms): 31713.7 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.366327E+00 | loss scale: 1024.0 | grad norm: 26158.355 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:27:21 CEST)" was missed by 0:00:13.715982 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:27:21 CEST)" was missed by 0:00:13.373429 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:27:21 CEST)" was missed by 0:00:13.558721 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:27:21 CEST)" was missed by 0:00:13.565524 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:27:21 CEST)" was missed by 0:00:13.680025 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:27:21 CEST)" was missed by 0:00:13.585709 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:27:21 CEST)" was missed by 0:00:13.629295 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:27:21 CEST)" was missed by 0:00:13.775614 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:27:21 CEST)" was missed by 0:00:13.519449 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:27:21 CEST)" was missed by 0:00:13.566712 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:27:21 CEST)" was missed by 0:00:13.625772 
- iteration 6770/ 159576 | consumed samples: 238832 | elapsed time per iteration (ms): 31633.6 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.351589E+00 | loss scale: 1024.0 | grad norm: 32550.218 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:36:21 CEST)" was missed by 0:00:04.867814
-[... same warning repeated many times; delays 0:00:04.56-0:00:05.07 ...]
- iteration 6780/ 159576 | consumed samples: 239792 | elapsed time per iteration (ms): 31040.4 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.392241E+00 | loss scale: 1024.0 | grad norm: 34799.773 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:37:21 CEST)" was missed by 0:00:05.685656
-[... same warning repeated many times; delays 0:00:05.38-0:00:05.89 ...]
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:38:21 CEST)" was missed by 0:00:07.626797
-[... same warning repeated many times; delays 0:00:07.32-0:00:07.83 ...]
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:39:21 CEST)" was missed by 0:00:09.137084
-[... same warning repeated many times; delays 0:00:08.83-0:00:09.34 ...]
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:40:21 CEST)" was missed by 0:00:10.766147
-[... same warning repeated many times; delays 0:00:10.46-0:00:10.97 ...]
- iteration 6790/ 159576 | consumed samples: 240752 | elapsed time per iteration (ms): 30867.9 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.349704E+00 | loss scale: 1024.0 | grad norm: 35833.809 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-[2021-09-30 08:43:20] PULSE: tr8-104B is running for 4:51:13 since 2021-09-30T03:52:07 (1289770 on 'gpu_p13' partition (r6i4n[5-6,8],r6i5n[4-5],r7i0n[5-8],r7i1n0,r8i2n8,r8i4n1,r8i7n[3-8],r9i0n[0-8],r9i1n[0-8],r9i2n[3-8],r9i3n[7-8],r9i4n[0-2],r9i5n[2,5-7],r9i6n[2-8],r14i7n[1-6])
- iteration 6800/ 159576 | consumed samples: 241712 | elapsed time per iteration (ms): 30873.9 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.367813E+00 | loss scale: 1024.0 | grad norm: 41424.861 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6810/ 159576 | consumed samples: 242672 | elapsed time per iteration (ms): 30682.6 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.382545E+00 | loss scale: 1024.0 | grad norm: 34278.039 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:57:21 CEST)" was missed by 0:00:03.136121
-[... same warning repeated many times; delays 0:00:03.00-0:00:03.34 ...]
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:57:21 CEST)" was missed by 0:00:03.291245 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:57:21 CEST)" was missed by 0:00:03.086629 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:57:21 CEST)" was missed by 0:00:03.024089 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:57:21 CEST)" was missed by 0:00:03.291760 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:57:21 CEST)" was missed by 0:00:03.278910 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:57:21 CEST)" was missed by 0:00:03.104955 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:57:21 CEST)" was missed by 0:00:03.241381 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:57:21 CEST)" was missed by 0:00:03.184134 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:57:21 CEST)" was missed by 0:00:03.242739 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:57:21 CEST)" was missed by 0:00:03.083104 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:57:21 CEST)" was missed by 0:00:03.336434 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:57:21 CEST)" was missed by 0:00:03.293641 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:57:21 CEST)" was missed by 0:00:03.337342 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:57:21 CEST)" was missed by 0:00:03.115762 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:57:21 CEST)" was missed by 0:00:03.023022 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:57:21 CEST)" was missed by 0:00:03.137523 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:57:21 CEST)" was missed by 0:00:03.206411 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:57:21 CEST)" was missed by 0:00:03.294254 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:57:21 CEST)" was missed by 0:00:03.016209 
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:57:21 CEST)" was missed by 0:00:03.043181 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:57:21 CEST)" was missed by 0:00:03.272722 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:57:21 CEST)" was missed by 0:00:03.304687 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:57:21 CEST)" was missed by 0:00:03.266615 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:57:21 CEST)" was missed by 0:00:03.016063 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:57:21 CEST)" was missed by 0:00:03.233115 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:57:21 CEST)" was missed by 0:00:03.173520 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:57:21 CEST)" was missed by 0:00:03.207756 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:57:21 CEST)" was missed by 0:00:03.190790 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:57:21 CEST)" was missed by 0:00:03.144514 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:57:21 CEST)" was missed by 0:00:03.104146 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:57:21 CEST)" was missed by 0:00:03.135412 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:57:21 CEST)" was missed by 0:00:03.004478 - iteration 6820/ 159576 | consumed samples: 243632 | elapsed time per iteration (ms): 30306.3 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.347304E+00 | loss scale: 1024.0 | grad norm: 43929.548 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:58:21 CEST)" was missed by 0:00:03.709859 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:58:21 CEST)" was missed by 0:00:03.628205 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:58:21 CEST)" was missed by 0:00:03.864953 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:58:21 CEST)" was missed by 0:00:03.660350 
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:59:21 CEST)" was missed by 0:00:05.861499
-[... same warning repeated many times; delays 0:00:05.56-0:00:06.06 ...]
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 
2021-09-30 08:59:21 CEST)" was missed by 0:00:05.860742 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 08:59:21 CEST)" was missed by 0:00:05.729799 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:00:21 CEST)" was missed by 0:00:07.885090 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:00:21 CEST)" was missed by 0:00:07.803490 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:00:21 CEST)" was missed by 0:00:08.040225 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:00:21 CEST)" was missed by 0:00:08.027880 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:00:21 CEST)" was missed by 0:00:07.864690 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:00:21 CEST)" was missed by 0:00:07.835635 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:00:21 CEST)" was missed by 0:00:07.933103 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:00:21 CEST)" was missed by 0:00:07.773057 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:00:21 CEST)" was missed by 0:00:07.832089 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:00:21 CEST)" was missed by 0:00:08.040745 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:00:21 CEST)" was missed by 0:00:07.693812 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:00:21 CEST)" was missed by 0:00:08.085367 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:00:21 CEST)" was missed by 0:00:07.991699 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:00:21 CEST)" was missed by 0:00:08.015522 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:00:21 CEST)" was missed by 0:00:07.579857 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:00:21 CEST)" was missed by 0:00:07.771991 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:00:21 CEST)" was missed by 0:00:07.990398 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:00:21 CEST)" was missed by 0:00:07.725895 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:00:21 CEST)" was missed by 0:00:08.043201 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:00:21 CEST)" was missed by 0:00:07.739728 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:00:21 CEST)" was missed by 0:00:08.086338 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:00:21 CEST)" was missed by 0:00:07.729376 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:00:21 CEST)" was missed by 0:00:07.765027 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:00:21 CEST)" was missed by 0:00:07.792166 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:00:21 CEST)" was missed by 0:00:07.854002 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:00:21 CEST)" was missed by 0:00:08.021657 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:00:21 CEST)" was missed by 0:00:07.765209 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:00:21 CEST)" was missed by 0:00:08.042666 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:00:21 CEST)" was missed by 0:00:07.893455 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:00:21 CEST)" was missed by 0:00:07.886550 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:00:21 CEST)" was missed by 0:00:07.922484 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:00:21 CEST)" was missed by 0:00:07.955422 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:00:21 CEST)" was missed by 0:00:07.982094 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:00:21 CEST)" was missed by 0:00:08.053673 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:00:21 CEST)" was missed by 0:00:07.732445 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:00:21 CEST)" was missed by 0:00:07.956786 
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:00:21 CEST)" was missed by 0:00:07.939813 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:00:21 CEST)" was missed by 0:00:07.853132 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:00:21 CEST)" was missed by 0:00:07.884442 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:00:21 CEST)" was missed by 0:00:07.753511 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:01:21 CEST)" was missed by 0:00:10.001971 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:01:21 CEST)" was missed by 0:00:10.157120 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:01:21 CEST)" was missed by 0:00:09.920383 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:01:21 CEST)" was missed by 0:00:10.049953 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:01:21 CEST)" was missed by 0:00:10.157588 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:01:21 CEST)" was missed by 0:00:10.202250 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:01:21 CEST)" was missed by 0:00:10.144764 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:01:21 CEST)" was missed by 0:00:09.981569 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:01:21 CEST)" was missed by 0:00:09.952532 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:01:21 CEST)" was missed by 0:00:09.696716 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:01:21 CEST)" was missed by 0:00:10.108574 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:01:21 CEST)" was missed by 0:00:09.842742 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:01:21 CEST)" was missed by 0:00:09.889958 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:01:21 CEST)" was missed by 0:00:09.948996 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 
2021-09-30 09:01:21 CEST)" was missed by 0:00:10.132406 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:01:21 CEST)" was missed by 0:00:09.856576 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:01:21 CEST)" was missed by 0:00:09.810741 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:01:21 CEST)" was missed by 0:00:10.159496 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:01:21 CEST)" was missed by 0:00:10.203204 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:01:21 CEST)" was missed by 0:00:10.003392 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:01:21 CEST)" was missed by 0:00:10.039328 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:01:21 CEST)" was missed by 0:00:09.909022 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:01:21 CEST)" was missed by 0:00:09.970858 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:01:21 CEST)" was missed by 0:00:10.072279 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:01:21 CEST)" was missed by 0:00:10.160077 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:01:21 CEST)" was missed by 0:00:09.882068 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:01:21 CEST)" was missed by 0:00:09.888870 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:01:21 CEST)" was missed by 0:00:10.138558 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:01:21 CEST)" was missed by 0:00:10.170536 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:01:21 CEST)" was missed by 0:00:10.107296 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:01:21 CEST)" was missed by 0:00:09.846270 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:01:21 CEST)" was missed by 0:00:09.881924 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:01:21 CEST)" was missed by 0:00:10.098959 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:01:21 CEST)" was missed by 0:00:10.010343 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:01:21 CEST)" was missed by 0:00:10.073654 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:01:21 CEST)" was missed by 0:00:09.849309 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:01:21 CEST)" was missed by 0:00:09.969996 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:01:21 CEST)" was missed by 0:00:10.056704 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:01:21 CEST)" was missed by 0:00:10.001309 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:01:21 CEST)" was missed by 0:00:09.870367 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:02:21 CEST)" was missed by 0:00:12.858015 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:02:21 CEST)" was missed by 0:00:12.776412 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:02:21 CEST)" was missed by 0:00:12.905979 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:02:21 CEST)" was missed by 0:00:12.808529 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:02:21 CEST)" was missed by 0:00:13.000785 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:02:21 CEST)" was missed by 0:00:13.013660 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:02:21 CEST)" was missed by 0:00:12.837619 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:02:21 CEST)" was missed by 0:00:12.745997 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:02:21 CEST)" was missed by 0:00:12.805005 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:02:21 CEST)" was missed by 0:00:12.988425 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:02:21 CEST)" was missed by 0:00:12.552766 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:02:21 CEST)" was missed by 0:00:12.712617 
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:02:21 CEST)" was missed by 0:00:12.666775 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:02:21 CEST)" was missed by 0:00:12.826874 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:02:21 CEST)" was missed by 0:00:12.964620 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:02:21 CEST)" was missed by 0:00:12.698798 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:02:21 CEST)" was missed by 0:00:12.737945 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:02:21 CEST)" was missed by 0:00:12.744920 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:02:21 CEST)" was missed by 0:00:12.765078 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:02:21 CEST)" was missed by 0:00:13.016123 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:02:21 CEST)" was missed by 0:00:12.738091 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:02:21 CEST)" was missed by 0:00:12.702293 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:02:21 CEST)" was missed by 0:00:12.859430 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:02:21 CEST)" was missed by 0:00:12.895391 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:02:21 CEST)" was missed by 0:00:12.994583 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:02:21 CEST)" was missed by 0:00:12.928322 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:02:21 CEST)" was missed by 0:00:12.705331 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:02:21 CEST)" was missed by 0:00:12.866391 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:02:21 CEST)" was missed by 0:00:12.955019 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:02:21 CEST)" was missed by 0:00:12.826040 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 
- iteration 6830/ 159576 | consumed samples: 244592 | elapsed time per iteration (ms): 31010.5 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.375375E+00 | loss scale: 1024.0 | grad norm: 40351.475 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6840/ 159576 | consumed samples: 245552 | elapsed time per iteration (ms): 30954.4 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.360943E+00 | loss scale: 1024.0 | grad norm: 42077.208 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6850/ 159576 | consumed samples: 246512 | elapsed time per iteration (ms): 30379.2 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.356625E+00 | loss scale: 1024.0 | grad norm: 36705.788 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6860/ 159576 | consumed samples: 247472 | elapsed time per iteration (ms): 30489.4 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.331403E+00 | loss scale: 1024.0 | grad norm: 28294.129 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:22:21 CEST)" was missed by 0:00:07.195723 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:22:21 CEST)" was missed by 0:00:07.095975 - iteration 6870/ 159576 | consumed samples: 248432 | elapsed time per iteration (ms): 30589.0 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.341326E+00 | loss scale: 1024.0 | grad norm: 33934.385 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:23:21 CEST)" was missed by 0:00:07.552252 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:23:21 CEST)" was missed by 0:00:07.753390 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:23:21 CEST)" was missed by 0:00:07.707384 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:23:21 CEST)" was missed by 0:00:07.600215 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:23:21 CEST)" was missed by 0:00:07.707854 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:23:21 CEST)" was missed by 0:00:07.470635 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:23:21 CEST)" was missed by 0:00:07.695018 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:23:21 CEST)" was missed by 0:00:07.589552 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:23:21 CEST)" was missed by 0:00:07.531803 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:23:21 CEST)" was missed by 0:00:07.502770 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:23:21 CEST)" was missed by 0:00:07.499233 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:23:21 CEST)" was missed by 0:00:07.246953 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:23:21 CEST)" was missed by 0:00:07.406784 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:23:21 CEST)" was missed by 0:00:07.439081 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:23:21 CEST)" was missed by 0:00:07.752558 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:23:21 CEST)" was missed by 0:00:07.622491 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:23:21 CEST)" was missed by 0:00:07.720743 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:23:21 CEST)" was missed by 0:00:07.710306 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:23:21 CEST)" was missed by 0:00:07.432289 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:23:21 CEST)" was missed by 0:00:07.361027 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:23:21 CEST)" was missed by 0:00:07.553630 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:23:21 CEST)" was missed by 0:00:07.459278 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:23:21 CEST)" was missed by 0:00:07.521102 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:23:21 CEST)" was missed by 0:00:07.649129 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:23:21 CEST)" was missed by 0:00:07.657550 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:23:21 CEST)" was missed by 0:00:07.658886 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:23:21 CEST)" was missed by 0:00:07.393037 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:23:21 CEST)" was missed by 0:00:07.396493 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:23:21 CEST)" was missed by 0:00:07.709769 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:23:21 CEST)" was missed by 0:00:07.560546 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:23:21 CEST)" was missed by 0:00:07.606868 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:23:21 CEST)" was missed by 0:00:07.440283 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:23:21 CEST)" was missed by 0:00:07.682745 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:23:21 CEST)" was missed by 0:00:07.623885 
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:23:21 CEST)" was missed by 0:00:07.399542 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:23:21 CEST)" was missed by 0:00:07.432192 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:23:21 CEST)" was missed by 0:00:07.520255 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:23:21 CEST)" was missed by 0:00:07.688880 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:23:21 CEST)" was missed by 0:00:07.551512 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:23:21 CEST)" was missed by 0:00:07.420584 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:24:21 CEST)" was missed by 0:00:07.184847 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:24:21 CEST)" was missed by 0:00:06.983780 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:24:21 CEST)" was missed by 0:00:06.902154 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:24:21 CEST)" was missed by 0:00:07.126533 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:24:21 CEST)" was missed by 0:00:07.138888 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:24:21 CEST)" was missed by 0:00:07.031735 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:24:21 CEST)" was missed by 0:00:07.139372 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:24:21 CEST)" was missed by 0:00:06.678450 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:24:21 CEST)" was missed by 0:00:07.184034 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:24:21 CEST)" was missed by 0:00:06.985064 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:24:21 CEST)" was missed by 0:00:06.890742 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:24:21 CEST)" was missed by 0:00:06.963339 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 
2021-09-30 09:24:21 CEST)" was missed by 0:00:06.952545 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:24:21 CEST)" was missed by 0:00:06.934260 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:24:21 CEST)" was missed by 0:00:07.088976 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:24:21 CEST)" was missed by 0:00:06.930749 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:24:21 CEST)" was missed by 0:00:06.863757 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:24:21 CEST)" was missed by 0:00:06.838333 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:24:21 CEST)" was missed by 0:00:06.870604 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:24:21 CEST)" was missed by 0:00:07.021083 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:24:21 CEST)" was missed by 0:00:07.055300 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:24:21 CEST)" was missed by 0:00:07.038302 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:24:21 CEST)" was missed by 0:00:07.053997 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:24:21 CEST)" was missed by 0:00:07.080654 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:24:21 CEST)" was missed by 0:00:07.152249 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:24:21 CEST)" was missed by 0:00:07.090400 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:24:21 CEST)" was missed by 0:00:06.871776 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:24:21 CEST)" was missed by 0:00:07.141827 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:24:21 CEST)" was missed by 0:00:06.792546 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:24:21 CEST)" was missed by 0:00:06.828018 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:24:21 CEST)" was missed by 0:00:07.141275 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:24:21 CEST)" was missed by 0:00:06.992076 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:24:21 CEST)" was missed by 0:00:06.982935 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:24:21 CEST)" was missed by 0:00:06.824572 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:24:21 CEST)" was missed by 0:00:06.831039 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:24:21 CEST)" was missed by 0:00:07.114231 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:24:21 CEST)" was missed by 0:00:06.851976 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:24:21 CEST)" was missed by 0:00:06.951726 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:24:21 CEST)" was missed by 0:00:06.863704 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:24:21 CEST)" was missed by 0:00:07.120381 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:25:21 CEST)" was missed by 0:00:07.598963 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:25:21 CEST)" was missed by 0:00:07.517361 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:25:21 CEST)" was missed by 0:00:07.741737 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:25:21 CEST)" was missed by 0:00:07.800073 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:25:21 CEST)" was missed by 0:00:07.636237 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:25:21 CEST)" was missed by 0:00:07.578496 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:25:21 CEST)" was missed by 0:00:07.754111 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:25:21 CEST)" was missed by 0:00:07.754557 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:25:21 CEST)" was missed by 0:00:07.646963 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:25:21 CEST)" was missed by 0:00:07.756984 
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:25:21 CEST)" was missed by 0:00:07.293644 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:25:21 CEST)" was missed by 0:00:07.453507 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:25:21 CEST)" was missed by 0:00:07.485768 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:25:21 CEST)" was missed by 0:00:07.600308 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:25:21 CEST)" was missed by 0:00:07.505970 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:25:21 CEST)" was missed by 0:00:07.567787 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:25:21 CEST)" was missed by 0:00:07.549487 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:25:21 CEST)" was missed by 0:00:07.669191 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:25:21 CEST)" was missed by 0:00:07.486937 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:25:21 CEST)" was missed by 0:00:07.545966 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:25:21 CEST)" was missed by 0:00:07.479004 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:25:21 CEST)" was missed by 0:00:07.799265 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:25:21 CEST)" was missed by 0:00:07.756478 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:25:21 CEST)" was missed by 0:00:07.695827 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:25:21 CEST)" was missed by 0:00:07.767461 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:25:21 CEST)" was missed by 0:00:07.705571 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:25:21 CEST)" was missed by 0:00:07.439748 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:25:21 CEST)" was missed by 0:00:07.729438 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 
2021-09-30 09:25:21 CEST)" was missed by 0:00:07.407742 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:25:21 CEST)" was missed by 0:00:07.443200 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:25:21 CEST)" was missed by 0:00:07.478859 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:25:21 CEST)" was missed by 0:00:07.607256 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:25:21 CEST)" was missed by 0:00:07.653579 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:25:21 CEST)" was missed by 0:00:07.704287 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:25:21 CEST)" was missed by 0:00:07.446228 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:25:21 CEST)" was missed by 0:00:07.670573 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:25:21 CEST)" was missed by 0:00:07.566920 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:25:21 CEST)" was missed by 0:00:07.735556 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:25:21 CEST)" was missed by 0:00:07.467252 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:25:21 CEST)" was missed by 0:00:07.598250 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:26:21 CEST)" was missed by 0:00:07.714031 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:26:21 CEST)" was missed by 0:00:07.632387 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:26:21 CEST)" was missed by 0:00:07.856758 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:26:21 CEST)" was missed by 0:00:07.915134 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:26:21 CEST)" was missed by 0:00:07.693565 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:26:21 CEST)" was missed by 0:00:07.869128 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:26:21 CEST)" was missed by 0:00:07.761970 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:26:21 CEST)" was missed by 0:00:07.869623 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:26:21 CEST)" was missed by 0:00:07.408705 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:26:21 CEST)" was missed by 0:00:07.914258 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:26:21 CEST)" was missed by 0:00:07.751304 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:26:21 CEST)" was missed by 0:00:07.621006 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:26:21 CEST)" was missed by 0:00:07.664539 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:26:21 CEST)" was missed by 0:00:07.820594 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:26:21 CEST)" was missed by 0:00:07.601970 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:26:21 CEST)" was missed by 0:00:07.661000 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:26:21 CEST)" was missed by 0:00:07.872065 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:26:21 CEST)" was missed by 0:00:07.568585 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:26:21 CEST)" was missed by 0:00:07.522767 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:26:21 CEST)" was missed by 0:00:07.600824 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:26:21 CEST)" was missed by 0:00:07.715359 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:26:21 CEST)" was missed by 0:00:07.682833 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:26:21 CEST)" was missed by 0:00:07.785559 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:26:21 CEST)" was missed by 0:00:07.768569 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:26:21 CEST)" was missed by 0:00:07.784237 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:26:21 CEST)" was missed by 0:00:07.810891 
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:26:21 CEST)" was missed by 0:00:07.882493 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:26:21 CEST)" was missed by 0:00:07.819257 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:26:21 CEST)" was missed by 0:00:07.554756 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:26:21 CEST)" was missed by 0:00:07.844470 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:26:21 CEST)" was missed by 0:00:07.594050 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:26:21 CEST)" was missed by 0:00:07.593900 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:26:21 CEST)" was missed by 0:00:07.871538 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:26:21 CEST)" was missed by 0:00:07.722310 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:26:21 CEST)" was missed by 0:00:07.681971 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:26:21 CEST)" was missed by 0:00:07.558304 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:26:21 CEST)" was missed by 0:00:07.850582 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:26:21 CEST)" was missed by 0:00:07.713209 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:26:21 CEST)" was missed by 0:00:07.561297 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:26:21 CEST)" was missed by 0:00:07.582245 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:27:21 CEST)" was missed by 0:00:08.329104 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:27:21 CEST)" was missed by 0:00:08.471876 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:27:21 CEST)" was missed by 0:00:08.530261 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:27:21 CEST)" was missed by 0:00:08.247554 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 
2021-09-30 09:27:21 CEST)" was missed by 0:00:08.484295 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:27:21 CEST)" was missed by 0:00:08.484746 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:27:21 CEST)" was missed by 0:00:08.023822 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:27:21 CEST)" was missed by 0:00:08.183678 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:27:21 CEST)" was missed by 0:00:08.137860 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:27:21 CEST)" was missed by 0:00:08.529405 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:27:21 CEST)" was missed by 0:00:08.377138 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:27:21 CEST)" was missed by 0:00:08.459539 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:27:21 CEST)" was missed by 0:00:08.486628 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:27:21 CEST)" was missed by 0:00:08.279692 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:27:21 CEST)" was missed by 0:00:08.276152 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:27:21 CEST)" was missed by 0:00:08.497621 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:27:21 CEST)" was missed by 0:00:08.209182 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:27:21 CEST)" was missed by 0:00:08.330519 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:27:21 CEST)" was missed by 0:00:08.236149 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:27:21 CEST)" was missed by 0:00:08.297990 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:27:21 CEST)" was missed by 0:00:08.173365 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:27:21 CEST)" was missed by 0:00:08.399409 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:27:21 CEST)" was missed by 0:00:08.426030 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:27:21 CEST)" was missed by 0:00:08.434442 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:27:21 CEST)" was missed by 0:00:08.383761 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:27:21 CEST)" was missed by 0:00:08.400755 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:27:21 CEST)" was missed by 0:00:08.176423 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:27:21 CEST)" was missed by 0:00:08.297118 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:27:21 CEST)" was missed by 0:00:08.308862 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:27:21 CEST)" was missed by 0:00:08.366606 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:27:21 CEST)" was missed by 0:00:08.487344 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:27:21 CEST)" was missed by 0:00:08.170035 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:27:21 CEST)" was missed by 0:00:08.217271 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:27:21 CEST)" was missed by 0:00:08.197438 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:27:21 CEST)" was missed by 0:00:08.435911 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:27:21 CEST)" was missed by 0:00:08.216154 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:27:21 CEST)" was missed by 0:00:08.328438 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:27:21 CEST)" was missed by 0:00:08.209169 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:27:21 CEST)" was missed by 0:00:08.337612 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:27:21 CEST)" was missed by 0:00:08.465882 - iteration 6880/ 159576 | consumed samples: 249392 | elapsed time per iteration (ms): 30100.7 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.354124E+00 | loss scale: 1024.0 | grad norm: 26852.610 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:28:21 CEST)" was missed by 0:00:07.250111 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:28:21 CEST)" was missed by 0:00:06.967422 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:28:21 CEST)" was missed by 0:00:07.249253 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:28:21 CEST)" was missed by 0:00:07.191750 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:28:21 CEST)" was missed by 0:00:07.028587 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:28:21 CEST)" was missed by 0:00:07.204157 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:28:21 CEST)" was missed by 0:00:07.097002 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:28:21 CEST)" was missed by 0:00:07.204640 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:28:21 CEST)" was missed by 0:00:06.743724 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:28:21 CEST)" was missed by 0:00:06.903581 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:28:21 CEST)" was missed by 0:00:06.857729 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:28:21 CEST)" was missed by 0:00:07.050357 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:28:21 CEST)" was missed by 0:00:07.086347 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:28:21 CEST)" was missed by 0:00:06.956009 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:28:21 CEST)" was missed by 0:00:07.017839 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:28:21 CEST)" was missed by 0:00:06.999553 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:28:21 CEST)" was missed by 0:00:07.119242 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:28:21 CEST)" was missed by 0:00:07.217491 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:28:21 CEST)" was missed by 0:00:07.154256 
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:28:21 CEST)" was missed by 0:00:06.937000 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:28:21 CEST)" was missed by 0:00:06.996038 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:28:21 CEST)" was missed by 0:00:07.207087 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:28:21 CEST)" was missed by 0:00:06.929043 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:28:21 CEST)" was missed by 0:00:06.935839 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:28:21 CEST)" was missed by 0:00:07.120573 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:28:21 CEST)" was missed by 0:00:07.103586 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:28:21 CEST)" was missed by 0:00:07.145899 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:28:21 CEST)" was missed by 0:00:07.155632 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:28:21 CEST)" was missed by 0:00:06.889789 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:28:21 CEST)" was missed by 0:00:07.179468 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:28:21 CEST)" was missed by 0:00:06.928924 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:28:21 CEST)" was missed by 0:00:07.206544 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:28:21 CEST)" was missed by 0:00:07.057324 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:28:21 CEST)" was missed by 0:00:06.896294 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:28:21 CEST)" was missed by 0:00:07.016978 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:28:21 CEST)" was missed by 0:00:06.893297 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:28:21 CEST)" was missed by 0:00:07.185611 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 
2021-09-30 09:28:21 CEST)" was missed by 0:00:06.917256 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:28:21 CEST)" was missed by 0:00:07.049187 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:28:21 CEST)" was missed by 0:00:07.048263 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:29:21 CEST)" was missed by 0:00:07.676444 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:29:21 CEST)" was missed by 0:00:07.877544 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:29:21 CEST)" was missed by 0:00:07.831585 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:29:21 CEST)" was missed by 0:00:07.832018 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:29:21 CEST)" was missed by 0:00:07.594856 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:29:21 CEST)" was missed by 0:00:07.819212 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:29:21 CEST)" was missed by 0:00:07.655983 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:29:21 CEST)" was missed by 0:00:07.371106 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:29:21 CEST)" was missed by 0:00:07.530978 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:29:21 CEST)" was missed by 0:00:07.677777 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:29:21 CEST)" was missed by 0:00:07.713732 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:29:21 CEST)" was missed by 0:00:07.724449 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:29:21 CEST)" was missed by 0:00:07.834490 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:29:21 CEST)" was missed by 0:00:07.563263 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:29:21 CEST)" was missed by 0:00:07.876730 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:29:21 CEST)" was missed by 0:00:07.833923 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:29:21 CEST)" was missed by 0:00:07.583443 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:29:21 CEST)" was missed by 0:00:07.645267 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:29:21 CEST)" was missed by 0:00:07.626972 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:29:21 CEST)" was missed by 0:00:07.746667 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:29:21 CEST)" was missed by 0:00:07.844930 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:29:21 CEST)" was missed by 0:00:07.564444 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:29:21 CEST)" was missed by 0:00:07.623460 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:29:21 CEST)" was missed by 0:00:07.806870 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:29:21 CEST)" was missed by 0:00:07.556485 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:29:21 CEST)" was missed by 0:00:07.485230 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:29:21 CEST)" was missed by 0:00:07.520644 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:29:21 CEST)" was missed by 0:00:07.684716 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:29:21 CEST)" was missed by 0:00:07.773319 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:29:21 CEST)" was missed by 0:00:07.781740 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:29:21 CEST)" was missed by 0:00:07.783082 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:29:21 CEST)" was missed by 0:00:07.517216 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:29:21 CEST)" was missed by 0:00:07.556344 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:29:21 CEST)" was missed by 0:00:07.731068 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:29:21 CEST)" was missed by 0:00:07.523727 
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:29:21 CEST)" was missed by 0:00:07.644410 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:29:21 CEST)" was missed by 0:00:07.748086 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:29:21 CEST)" was missed by 0:00:07.813057 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:29:21 CEST)" was missed by 0:00:07.544744 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:29:21 CEST)" was missed by 0:00:07.675724 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:30:21 CEST)" was missed by 0:00:08.191580 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:30:21 CEST)" was missed by 0:00:08.392650 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:30:21 CEST)" was missed by 0:00:08.334309 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:30:21 CEST)" was missed by 0:00:08.346705 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:30:21 CEST)" was missed by 0:00:08.109985 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:30:21 CEST)" was missed by 0:00:08.391803 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:30:21 CEST)" was missed by 0:00:08.171143 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:30:21 CEST)" was missed by 0:00:08.142100 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:30:21 CEST)" was missed by 0:00:08.239563 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:30:21 CEST)" was missed by 0:00:08.079536 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:30:21 CEST)" was missed by 0:00:08.347190 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:30:21 CEST)" was missed by 0:00:08.000310 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:30:21 CEST)" was missed by 0:00:08.192916 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 
2021-09-30 09:30:21 CEST)" was missed by 0:00:08.138582 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:30:21 CEST)" was missed by 0:00:07.886298 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:30:21 CEST)" was missed by 0:00:08.078424 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:30:21 CEST)" was missed by 0:00:08.228908 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:30:21 CEST)" was missed by 0:00:08.098573 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:30:21 CEST)" was missed by 0:00:08.160397 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:30:21 CEST)" was missed by 0:00:08.261809 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:30:21 CEST)" was missed by 0:00:08.349628 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:30:21 CEST)" was missed by 0:00:08.071617 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:30:21 CEST)" was missed by 0:00:08.046195 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:30:21 CEST)" was missed by 0:00:08.288459 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:30:21 CEST)" was missed by 0:00:08.360077 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:30:21 CEST)" was missed by 0:00:08.296860 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:30:21 CEST)" was missed by 0:00:08.298202 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:30:21 CEST)" was missed by 0:00:08.032346 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:30:21 CEST)" was missed by 0:00:08.322029 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:30:21 CEST)" was missed by 0:00:08.349105 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:30:21 CEST)" was missed by 0:00:08.246203 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:30:21 CEST)" was missed by 0:00:08.071461 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:30:21 CEST)" was missed by 0:00:08.199878 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:30:21 CEST)" was missed by 0:00:08.038860 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:30:21 CEST)" was missed by 0:00:08.159546 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:30:21 CEST)" was missed by 0:00:08.035871 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:30:21 CEST)" was missed by 0:00:08.328182 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:30:21 CEST)" was missed by 0:00:08.263225 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:30:21 CEST)" was missed by 0:00:08.059886 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:30:21 CEST)" was missed by 0:00:08.190875 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:31:21 CEST)" was missed by 0:00:08.676299 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:31:21 CEST)" was missed by 0:00:08.877407 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:31:21 CEST)" was missed by 0:00:08.819038 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:31:21 CEST)" was missed by 0:00:08.655862 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:31:21 CEST)" was missed by 0:00:08.831478 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:31:21 CEST)" was missed by 0:00:08.564261 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:31:21 CEST)" was missed by 0:00:08.530865 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:31:21 CEST)" was missed by 0:00:08.876564 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:31:21 CEST)" was missed by 0:00:08.724317 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:31:21 CEST)" was missed by 0:00:08.806723 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:31:21 CEST)" was missed by 0:00:08.831954 
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:31:21 CEST)" was missed by 0:00:08.371024 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:31:21 CEST)" was missed by 0:00:08.677661 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:31:21 CEST)" was missed by 0:00:08.713624 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:31:21 CEST)" was missed by 0:00:08.645127 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:31:21 CEST)" was missed by 0:00:08.626871 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:31:21 CEST)" was missed by 0:00:08.746545 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:31:21 CEST)" was missed by 0:00:08.844802 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:31:21 CEST)" was missed by 0:00:08.781571 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:31:21 CEST)" was missed by 0:00:08.782921 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:31:21 CEST)" was missed by 0:00:08.517055 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:31:21 CEST)" was missed by 0:00:08.623332 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:31:21 CEST)" was missed by 0:00:08.834352 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:31:21 CEST)" was missed by 0:00:08.556341 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:31:21 CEST)" was missed by 0:00:08.485079 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:31:21 CEST)" was missed by 0:00:08.594816 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:31:21 CEST)" was missed by 0:00:08.563164 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:31:21 CEST)" was missed by 0:00:08.833834 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:31:21 CEST)" was missed by 0:00:08.583318 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 
2021-09-30 09:31:21 CEST)" was missed by 0:00:08.730894 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:31:21 CEST)" was missed by 0:00:08.773183 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:31:21 CEST)" was missed by 0:00:08.556193 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:31:21 CEST)" was missed by 0:00:08.684603 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:31:21 CEST)" was missed by 0:00:08.747910 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:31:21 CEST)" was missed by 0:00:08.523590 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:31:21 CEST)" was missed by 0:00:08.644275 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:31:21 CEST)" was missed by 0:00:08.520583 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:31:21 CEST)" was missed by 0:00:08.812900 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:31:21 CEST)" was missed by 0:00:08.675549 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:31:21 CEST)" was missed by 0:00:08.544582 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:32:21 CEST)" was missed by 0:00:08.200561 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:32:21 CEST)" was missed by 0:00:08.057890 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:32:21 CEST)" was missed by 0:00:08.258986 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:32:21 CEST)" was missed by 0:00:08.213029 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:32:21 CEST)" was missed by 0:00:07.976308 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:32:21 CEST)" was missed by 0:00:08.037443 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:32:21 CEST)" was missed by 0:00:07.945835 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:32:21 CEST)" was missed by 0:00:08.213496 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:32:21 CEST)" was missed by 0:00:07.752572 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:32:21 CEST)" was missed by 0:00:07.912464 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:32:21 CEST)" was missed by 0:00:08.258149 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:32:21 CEST)" was missed by 0:00:07.964870 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:32:21 CEST)" was missed by 0:00:08.026702 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:32:21 CEST)" was missed by 0:00:08.112433 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:32:21 CEST)" was missed by 0:00:08.008442 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:32:21 CEST)" was missed by 0:00:08.163116 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:32:21 CEST)" was missed by 0:00:08.105891 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:32:21 CEST)" was missed by 0:00:08.215950 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:32:21 CEST)" was missed by 0:00:07.937915 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:32:21 CEST)" was missed by 0:00:07.866648 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:32:21 CEST)" was missed by 0:00:07.944717 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:32:21 CEST)" was missed by 0:00:08.059226 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:32:21 CEST)" was missed by 0:00:08.095231 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:32:21 CEST)" was missed by 0:00:08.129439 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:32:21 CEST)" was missed by 0:00:08.128139 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:32:21 CEST)" was missed by 0:00:08.154776 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:32:21 CEST)" was missed by 0:00:08.226376 
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:32:21 CEST)" was missed by 0:00:08.164492 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:32:21 CEST)" was missed by 0:00:07.898665 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:32:21 CEST)" was missed by 0:00:08.004924 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:32:21 CEST)" was missed by 0:00:08.188357 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:32:21 CEST)" was missed by 0:00:08.066177 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:32:21 CEST)" was missed by 0:00:07.902163 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:32:21 CEST)" was missed by 0:00:07.937792 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:32:21 CEST)" was missed by 0:00:08.215427 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:32:21 CEST)" was missed by 0:00:08.194464 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:32:21 CEST)" was missed by 0:00:08.057100 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:32:21 CEST)" was missed by 0:00:07.905167 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:32:21 CEST)" was missed by 0:00:08.025863 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:32:21 CEST)" was missed by 0:00:07.926125 - iteration 6890/ 159576 | consumed samples: 250352 | elapsed time per iteration (ms): 29945.7 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.342242E+00 | loss scale: 1024.0 | grad norm: 27208.338 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:33:21 CEST)" was missed by 0:00:09.406880 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:33:21 CEST)" was missed by 0:00:09.608008 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:33:21 CEST)" was missed by 0:00:09.562017 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:33:21 CEST)" was missed by 0:00:09.325290 
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:36:21 CEST)" was missed by 0:00:12.179233
- iteration 6900/ 159576 | consumed samples: 251312 | elapsed time per iteration (ms): 30553.8 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.401146E+00 | loss scale: 1024.0 | grad norm: 32945.971 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-saving checkpoint at iteration 6900 to /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints
-[2021-09-30 09:37:40,487] [INFO] [logging.py:68:log_dist] [Rank 0] Saving model checkpoint: /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/global_step6900/mp_rank_00_model_states.pt
- successfully saved checkpoint at iteration 6900 to /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints
-time (ms) | save-checkpoint: 20695.40
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:39:21 CEST)" was missed by 0:00:04.586124
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:42:21 CEST)" was missed by 0:00:07.165566
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:42:21 CEST)" was missed by 0:00:07.230520 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:42:21 CEST)" was missed by 0:00:07.061905 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:42:21 CEST)" was missed by 0:00:06.938268 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:42:21 CEST)" was missed by 0:00:07.294368 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:42:21 CEST)" was missed by 0:00:07.148603 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:42:21 CEST)" was missed by 0:00:07.224460 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:42:21 CEST)" was missed by 0:00:07.251592 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:42:21 CEST)" was missed by 0:00:07.093268 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:42:21 CEST)" was missed by 0:00:06.962297 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:43:21 CEST)" was missed by 0:00:07.195381 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:43:21 CEST)" was missed by 0:00:07.234505 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:43:21 CEST)" was missed by 0:00:07.497423 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:43:21 CEST)" was missed by 0:00:07.555850 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:43:21 CEST)" was missed by 0:00:07.354711 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:43:21 CEST)" was missed by 0:00:07.334255 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:43:21 CEST)" was missed by 0:00:07.459883 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:43:21 CEST)" was missed by 0:00:07.461243 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:43:21 CEST)" was missed by 0:00:07.242597 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 
2021-09-30 09:43:21 CEST)" was missed by 0:00:07.234710 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:43:21 CEST)" was missed by 0:00:07.163440 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:43:21 CEST)" was missed by 0:00:07.273150 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:43:21 CEST)" was missed by 0:00:07.241514 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:43:21 CEST)" was missed by 0:00:07.555017 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:43:21 CEST)" was missed by 0:00:07.391995 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:43:21 CEST)" was missed by 0:00:07.424929 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:43:21 CEST)" was missed by 0:00:07.451605 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:43:21 CEST)" was missed by 0:00:07.523141 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:43:21 CEST)" was missed by 0:00:07.201955 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:43:21 CEST)" was missed by 0:00:07.485088 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:43:21 CEST)" was missed by 0:00:07.510353 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:43:21 CEST)" was missed by 0:00:07.049425 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:43:21 CEST)" was missed by 0:00:07.362994 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:43:21 CEST)" was missed by 0:00:07.356099 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:43:21 CEST)" was missed by 0:00:07.261725 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:43:21 CEST)" was missed by 0:00:07.323526 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:43:21 CEST)" was missed by 0:00:07.426243 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:43:21 CEST)" was missed by 0:00:07.491219 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:43:21 CEST)" was missed by 0:00:07.409275 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:43:21 CEST)" was missed by 0:00:07.509961 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:43:21 CEST)" was missed by 0:00:07.305315 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:43:21 CEST)" was missed by 0:00:07.402785 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:43:21 CEST)" was missed by 0:00:07.512789 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:43:21 CEST)" was missed by 0:00:07.209341 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:43:21 CEST)" was missed by 0:00:07.301807 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:43:21 CEST)" was missed by 0:00:07.322649 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:43:21 CEST)" was missed by 0:00:07.199006 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:43:21 CEST)" was missed by 0:00:07.353920 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:43:21 CEST)" was missed by 0:00:07.222942 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:43:21 CEST)" was missed by 0:00:07.512333 - iteration 6910/ 159576 | consumed samples: 252272 | elapsed time per iteration (ms): 32356.2 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.401971E+00 | loss scale: 1024.0 | grad norm: 36409.921 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -[2021-09-30 09:42:49] PULSE: tr8-104B is running for 5:50:42 since 2021-09-30T03:52:07 (1289770 on 'gpu_p13' partition (r6i4n[5-6,8],r6i5n[4-5],r7i0n[5-8],r7i1n0,r8i2n8,r8i4n1,r8i7n[3-8],r9i0n[0-8],r9i1n[0-8],r9i2n[3-8],r9i3n[7-8],r9i4n[0-2],r9i5n[2,5-7],r9i6n[2-8],r14i7n[1-6]) -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:44:21 CEST)" was missed by 0:00:07.114176 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:44:21 CEST)" was missed by 0:00:06.954866 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:44:21 CEST)" was missed by 0:00:07.314483 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:44:21 CEST)" was missed 
-[... the "BaseEmissionsTracker._measure_power" missed-run warnings repeat at every one-minute tick from 09:44:21 through 09:48:21 CEST, the delay drifting from ~6.8 s up to ~10 s ...]
- iteration 6920/ 159576 | consumed samples: 253232 | elapsed time per iteration (ms): 30245.3 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.349960E+00 | loss scale: 1024.0 | grad norm: 31484.583 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
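The two iteration lines in this stretch are enough for a quick consistency check: consumed samples rise from 252272 at iteration 6910 to 253232 at iteration 6920, i.e. 960 samples over 10 steps, which matches the logged global batch size of 96, and at ~30.2 s per iteration that is about 3.2 samples/s. A throwaway parser (illustrative only; the regexes just read the fields off the Megatron-style lines above):

```python
# Sanity-check throughput from two Megatron-style iteration log lines.
import re

lines = [
    "iteration 6910/ 159576 | consumed samples: 252272 | elapsed time per iteration (ms): 32356.2 |",
    "iteration 6920/ 159576 | consumed samples: 253232 | elapsed time per iteration (ms): 30245.3 |",
]

def parse(line: str) -> tuple[int, int, float]:
    step = int(re.search(r"iteration\s+(\d+)", line).group(1))
    samples = int(re.search(r"consumed samples:\s+(\d+)", line).group(1))
    ms = float(re.search(r"elapsed time per iteration \(ms\):\s+([\d.]+)", line).group(1))
    return step, samples, ms

(s0, n0, _), (s1, n1, ms1) = parse(lines[0]), parse(lines[1])
batch = (n1 - n0) / (s1 - s0)                      # 960 / 10 = 96.0
print(f"samples per iteration: {batch:.0f}")       # matches 'global batch size: 96'
print(f"throughput: {batch / (ms1 / 1000):.2f} samples/s")  # ~3.17
```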
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 09:50:21 CEST)" was missed by 0:00:12.267571
- iteration 6930/ 159576 | consumed samples: 254192 | elapsed time per iteration (ms): 30563.9 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.359145E+00 | loss scale: 1024.0 | grad norm: 35548.200 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6940/ 159576 | consumed samples: 255152 | elapsed time per iteration (ms): 31024.0 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.335144E+00 | loss scale: 1024.0 | grad norm: 39326.813 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:02:21 CEST)" was missed by 0:00:04.049260
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:03:21 CEST)" was missed by 0:00:07.194785
- iteration 6950/ 159576 | consumed samples: 256112 | elapsed time per iteration (ms): 31223.5 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.349543E+00 | loss scale: 1024.0 | grad norm: 44884.935 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:04:21 CEST)" was missed by 0:00:08.458290
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:05:21 CEST)" was missed by 0:00:10.324515
- iteration 6960/ 159576 | consumed samples: 257072 | elapsed time per iteration (ms): 30960.7 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.356559E+00 | loss scale: 1024.0 | grad norm: 43203.743 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 6970/ 159576 | consumed samples: 258032 | elapsed time per iteration (ms): 30979.7 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.375929E+00 | loss scale: 1024.0 | grad norm: 63431.383 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:18:21 CEST)" was missed by 0:00:04.423298
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:19:21 CEST)" was missed by 0:00:05.590221
- iteration 6980/ 159576 | consumed samples: 258992 | elapsed time per iteration (ms): 30847.1 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.376985E+00 | loss scale: 1024.0 | grad norm: 43276.292 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:20:21 CEST)" was missed by 0:00:06.004556
10:20:21 CEST)" was missed by 0:00:05.690140 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:20:21 CEST)" was missed by 0:00:05.737348 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:20:21 CEST)" was missed by 0:00:05.796402 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:20:21 CEST)" was missed by 0:00:05.979774 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:20:21 CEST)" was missed by 0:00:06.005034 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:20:21 CEST)" was missed by 0:00:06.007466 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:20:21 CEST)" was missed by 0:00:05.544119 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:20:21 CEST)" was missed by 0:00:05.729430 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:20:21 CEST)" was missed by 0:00:05.767800 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:20:21 CEST)" was missed by 0:00:05.729271 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:20:21 CEST)" was missed by 0:00:05.736256 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:20:21 CEST)" was missed by 0:00:05.857713 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:20:21 CEST)" was missed by 0:00:05.850781 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:20:21 CEST)" was missed by 0:00:05.886736 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:20:21 CEST)" was missed by 0:00:05.756421 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:20:21 CEST)" was missed by 0:00:05.818276 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:20:21 CEST)" was missed by 0:00:05.848605 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:20:21 CEST)" was missed by 0:00:05.903956 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:20:21 CEST)" was missed by 0:00:05.919662 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:20:21 CEST)" was missed by 0:00:05.946304 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:20:21 CEST)" was missed by 0:00:05.696720 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:20:21 CEST)" was missed by 0:00:05.704016 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:20:21 CEST)" was missed by 0:00:05.658182 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:20:21 CEST)" was missed by 0:00:05.717580 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:20:21 CEST)" was missed by 0:00:06.006980 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:20:21 CEST)" was missed by 0:00:05.693710 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:20:21 CEST)" was missed by 0:00:05.817390 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:21:21 CEST)" was missed by 0:00:07.753128 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:21:21 CEST)" was missed by 0:00:07.803544 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:21:21 CEST)" was missed by 0:00:07.342636 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:21:21 CEST)" was missed by 0:00:07.790729 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:21:21 CEST)" was missed by 0:00:07.849124 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:21:21 CEST)" was missed by 0:00:07.648038 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:21:21 CEST)" was missed by 0:00:07.627562 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:21:21 CEST)" was missed by 0:00:07.719479 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:21:21 CEST)" was missed by 0:00:07.784447 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:21:21 CEST)" was missed by 0:00:07.803139 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:21:21 CEST)" was missed by 0:00:07.695972 
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:21:21 CEST)" was missed by 0:00:07.754523 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:21:21 CEST)" was missed by 0:00:07.488658 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:21:21 CEST)" was missed by 0:00:07.535875 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:21:21 CEST)" was missed by 0:00:07.778362 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:21:21 CEST)" was missed by 0:00:07.528010 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:21:21 CEST)" was missed by 0:00:07.456722 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:21:21 CEST)" was missed by 0:00:07.566419 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:21:21 CEST)" was missed by 0:00:07.848278 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:21:21 CEST)" was missed by 0:00:07.649378 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:21:21 CEST)" was missed by 0:00:07.685302 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:21:21 CEST)" was missed by 0:00:07.554991 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:21:21 CEST)" was missed by 0:00:07.702525 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:21:21 CEST)" was missed by 0:00:07.598546 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:21:21 CEST)" was missed by 0:00:07.718225 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:21:21 CEST)" was missed by 0:00:07.744875 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:21:21 CEST)" was missed by 0:00:07.816465 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:21:21 CEST)" was missed by 0:00:07.595002 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:21:21 CEST)" was missed by 0:00:07.806070 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 
2021-09-30 10:21:21 CEST)" was missed by 0:00:07.502551 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:21:21 CEST)" was missed by 0:00:07.527831 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:21:21 CEST)" was missed by 0:00:07.534837 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:21:21 CEST)" was missed by 0:00:07.805537 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:21:21 CEST)" was missed by 0:00:07.616866 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:21:21 CEST)" was missed by 0:00:07.647184 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:21:21 CEST)" was missed by 0:00:07.495285 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:21:21 CEST)" was missed by 0:00:07.492246 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:21:21 CEST)" was missed by 0:00:07.516178 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:21:21 CEST)" was missed by 0:00:07.656330 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:21:21 CEST)" was missed by 0:00:07.615954 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:22:21 CEST)" was missed by 0:00:10.333345 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:22:21 CEST)" was missed by 0:00:10.285439 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:22:21 CEST)" was missed by 0:00:10.440571 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:22:21 CEST)" was missed by 0:00:10.390568 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:22:21 CEST)" was missed by 0:00:10.485701 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:22:21 CEST)" was missed by 0:00:10.428173 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:22:21 CEST)" was missed by 0:00:10.486538 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:22:21 CEST)" was missed by 0:00:10.264962 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:22:21 CEST)" was missed by 0:00:10.391971 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:22:21 CEST)" was missed by 0:00:10.126107 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:22:21 CEST)" was missed by 0:00:10.173342 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:22:21 CEST)" was missed by 0:00:10.415781 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:22:21 CEST)" was missed by 0:00:10.441046 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:22:21 CEST)" was missed by 0:00:09.980101 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:22:21 CEST)" was missed by 0:00:10.094164 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:22:21 CEST)" was missed by 0:00:10.203834 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:22:21 CEST)" was missed by 0:00:10.165256 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:22:21 CEST)" was missed by 0:00:10.172254 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:22:21 CEST)" was missed by 0:00:10.286804 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:22:21 CEST)" was missed by 0:00:10.322746 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:22:21 CEST)" was missed by 0:00:10.192427 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:22:21 CEST)" was missed by 0:00:10.356975 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:22:21 CEST)" was missed by 0:00:10.421905 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:22:21 CEST)" was missed by 0:00:10.235980 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:22:21 CEST)" was missed by 0:00:10.355640 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:22:21 CEST)" was missed by 0:00:10.382307 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:22:21 CEST)" was missed by 0:00:10.232434 
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:22:21 CEST)" was missed by 0:00:10.443475 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:22:21 CEST)" was missed by 0:00:10.165449 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:22:21 CEST)" was missed by 0:00:10.140011 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:22:21 CEST)" was missed by 0:00:10.254292 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:22:21 CEST)" was missed by 0:00:10.340003 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:22:21 CEST)" was missed by 0:00:10.453919 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:22:21 CEST)" was missed by 0:00:10.132713 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:22:21 CEST)" was missed by 0:00:10.442977 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:22:21 CEST)" was missed by 0:00:10.293719 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:22:21 CEST)" was missed by 0:00:10.284642 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:22:21 CEST)" was missed by 0:00:10.253371 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:22:21 CEST)" was missed by 0:00:10.129714 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:22:21 CEST)" was missed by 0:00:10.153647 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:23:21 CEST)" was missed by 0:00:12.693414 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:23:21 CEST)" was missed by 0:00:12.486145 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:23:21 CEST)" was missed by 0:00:12.533373 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:23:21 CEST)" was missed by 0:00:12.340158 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:23:21 CEST)" was missed by 0:00:12.525442 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 
2021-09-30 10:23:21 CEST)" was missed by 0:00:12.563852 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:23:21 CEST)" was missed by 0:00:12.532277 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:23:21 CEST)" was missed by 0:00:12.646825 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:23:21 CEST)" was missed by 0:00:12.682765 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:23:21 CEST)" was missed by 0:00:12.552428 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:23:21 CEST)" was missed by 0:00:12.595980 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:23:21 CEST)" was missed by 0:00:12.715669 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:23:21 CEST)" was missed by 0:00:12.742325 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:23:21 CEST)" was missed by 0:00:12.592446 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:23:21 CEST)" was missed by 0:00:12.489669 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:23:21 CEST)" was missed by 0:00:12.513611 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:23:21 CEST)" was missed by 0:00:12.525302 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:23:21 CEST)" was missed by 0:00:12.614291 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:23:21 CEST)" was missed by 0:00:12.700007 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:23:21 CEST)" was missed by 0:00:12.492717 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:23:21 CEST)" was missed by 0:00:12.500061 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:23:21 CEST)" was missed by 0:00:12.653733 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:23:21 CEST)" was missed by 0:00:12.644659 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:23:21 CEST)" was missed by 0:00:12.613412 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:23:21 CEST)" was missed by 0:00:12.776100 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:23:21 CEST)" was missed by 0:00:12.454481 - iteration 6990/ 159576 | consumed samples: 259952 | elapsed time per iteration (ms): 30916.3 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.387246E+00 | loss scale: 1024.0 | grad norm: 65251.158 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7000/ 159576 | consumed samples: 260912 | elapsed time per iteration (ms): 31006.2 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.367493E+00 | loss scale: 1024.0 | grad norm: 44514.719 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) ------------------------------------------------------------------------------------------------- - validation loss at iteration 7000 | lm loss value: 6.309564E+00 | lm loss PPL: 5.498052E+02 | ------------------------------------------------------------------------------------------------- -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:31:21 CEST)" was missed by 0:00:12.331069 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:31:21 CEST)" was missed by 0:00:12.310558 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:31:21 CEST)" was missed by 0:00:12.218923 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:31:21 CEST)" was missed by 0:00:12.368322 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:31:21 CEST)" was missed by 0:00:12.467491 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:31:21 CEST)" was missed by 0:00:12.486192 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:31:21 CEST)" was missed by 0:00:12.281573 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:31:21 CEST)" was missed by 0:00:12.401232 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:31:21 CEST)" was missed by 0:00:12.379036 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:31:21 CEST)" was missed by 0:00:12.437564 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:31:21 CEST)" was missed by 0:00:12.171736 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:31:21 CEST)" was missed by 0:00:12.486661 
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:31:21 CEST)" was missed by 0:00:12.489067 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:31:21 CEST)" was missed by 0:00:12.025703 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:31:21 CEST)" was missed by 0:00:12.249458 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:31:21 CEST)" was missed by 0:00:12.217850 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:31:21 CEST)" was missed by 0:00:12.473836 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:31:21 CEST)" was missed by 0:00:12.532184 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:31:21 CEST)" was missed by 0:00:12.332441 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:31:21 CEST)" was missed by 0:00:12.427957 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:31:21 CEST)" was missed by 0:00:12.436314 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:31:21 CEST)" was missed by 0:00:12.278066 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:31:21 CEST)" was missed by 0:00:12.461481 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:31:21 CEST)" was missed by 0:00:12.211058 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:31:21 CEST)" was missed by 0:00:12.185629 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:31:21 CEST)" was missed by 0:00:12.139845 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:31:21 CEST)" was missed by 0:00:12.175256 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:31:21 CEST)" was missed by 0:00:12.210907 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:31:21 CEST)" was missed by 0:00:12.531411 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:31:21 CEST)" was missed by 0:00:12.488573 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 
2021-09-30 10:31:21 CEST)" was missed by 0:00:12.339319 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:31:21 CEST)" was missed by 0:00:12.238057 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:31:21 CEST)" was missed by 0:00:12.299915 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:31:21 CEST)" was missed by 0:00:12.499550 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:31:21 CEST)" was missed by 0:00:12.178304 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:31:21 CEST)" was missed by 0:00:12.298965 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:31:21 CEST)" was missed by 0:00:12.385739 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:31:21 CEST)" was missed by 0:00:12.402751 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:31:21 CEST)" was missed by 0:00:12.199430 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:31:21 CEST)" was missed by 0:00:12.330416 - iteration 7010/ 159576 | consumed samples: 261872 | elapsed time per iteration (ms): 32510.2 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.334219E+00 | loss scale: 1024.0 | grad norm: 56886.972 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7020/ 159576 | consumed samples: 262832 | elapsed time per iteration (ms): 30617.5 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.347347E+00 | loss scale: 2048.0 | grad norm: 74987.419 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -[2021-09-30 10:42:55] PULSE: tr8-104B is running for 6:50:48 since 2021-09-30T03:52:07 (1289770 on 'gpu_p13' partition (r6i4n[5-6,8],r6i5n[4-5],r7i0n[5-8],r7i1n0,r8i2n8,r8i4n1,r8i7n[3-8],r9i0n[0-8],r9i1n[0-8],r9i2n[3-8],r9i3n[7-8],r9i4n[0-2],r9i5n[2,5-7],r9i6n[2-8],r14i7n[1-6]) -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:44:21 CEST)" was missed by 0:00:03.687378 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:44:21 CEST)" was missed by 0:00:03.486316 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:44:21 CEST)" was missed by 0:00:03.556456 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:44:21 CEST)" was missed by 0:00:03.591500 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run 
at: 2021-09-30 10:44:21 CEST)" was missed by 0:00:03.644275 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:44:21 CEST)" was missed by 0:00:03.180963 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:44:21 CEST)" was missed by 0:00:03.366273 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:44:21 CEST)" was missed by 0:00:03.629044 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:44:21 CEST)" was missed by 0:00:03.487594 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:44:21 CEST)" was missed by 0:00:03.393243 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:44:21 CEST)" was missed by 0:00:03.465853 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:44:21 CEST)" was missed by 0:00:03.641493 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:44:21 CEST)" was missed by 0:00:03.436858 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:44:21 CEST)" was missed by 0:00:03.583154 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:44:21 CEST)" was missed by 0:00:03.654695 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:44:21 CEST)" was missed by 0:00:03.534318 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:44:21 CEST)" was missed by 0:00:03.592863 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:44:21 CEST)" was missed by 0:00:03.327008 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:44:21 CEST)" was missed by 0:00:03.374260 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:44:21 CEST)" was missed by 0:00:03.333526 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:44:21 CEST)" was missed by 0:00:03.433308 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:44:21 CEST)" was missed by 0:00:03.616702 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:44:21 CEST)" was missed by 0:00:03.641934 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:44:21 CEST)" was missed by 0:00:03.330519 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:44:21 CEST)" was missed by 0:00:03.404737 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:44:21 CEST)" was missed by 0:00:03.373120 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:44:21 CEST)" was missed by 0:00:03.686600 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:44:21 CEST)" was missed by 0:00:03.643855 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:44:21 CEST)" was missed by 0:00:03.494604 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:44:21 CEST)" was missed by 0:00:03.523616 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:44:21 CEST)" was missed by 0:00:03.455122 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:44:21 CEST)" was missed by 0:00:03.557858 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:44:21 CEST)" was missed by 0:00:03.622847 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:44:21 CEST)" was missed by 0:00:03.540872 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:44:21 CEST)" was missed by 0:00:03.454219 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:44:21 CEST)" was missed by 0:00:03.340921 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:44:21 CEST)" was missed by 0:00:03.295094 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:44:21 CEST)" was missed by 0:00:03.366235 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:44:21 CEST)" was missed by 0:00:03.354605 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:44:21 CEST)" was missed by 0:00:03.485573 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:45:21 CEST)" was missed by 0:00:04.476046 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:45:21 CEST)" was missed by 0:00:04.170699 
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:45:21 CEST)" was missed by 0:00:04.618775 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:45:21 CEST)" was missed by 0:00:04.677183 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:45:21 CEST)" was missed by 0:00:04.383046 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:45:21 CEST)" was missed by 0:00:04.455612 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:45:21 CEST)" was missed by 0:00:04.631219 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:45:21 CEST)" was missed by 0:00:04.426597 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:45:21 CEST)" was missed by 0:00:04.546232 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:45:21 CEST)" was missed by 0:00:04.524067 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:45:21 CEST)" was missed by 0:00:04.582611 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:45:21 CEST)" was missed by 0:00:04.316767 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:45:21 CEST)" was missed by 0:00:04.606429 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:45:21 CEST)" was missed by 0:00:04.634084 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:45:21 CEST)" was missed by 0:00:04.356057 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:45:21 CEST)" was missed by 0:00:04.284782 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:45:21 CEST)" was missed by 0:00:04.320264 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:45:21 CEST)" was missed by 0:00:04.394503 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:45:21 CEST)" was missed by 0:00:04.676331 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:45:21 CEST)" was missed by 0:00:04.477401 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 
2021-09-30 10:45:21 CEST)" was missed by 0:00:04.513352 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:45:21 CEST)" was missed by 0:00:04.572910 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:45:21 CEST)" was missed by 0:00:04.644467 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:45:21 CEST)" was missed by 0:00:04.364034 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:45:21 CEST)" was missed by 0:00:04.323272 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:45:21 CEST)" was missed by 0:00:04.423067 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:45:21 CEST)" was missed by 0:00:04.631699 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:45:21 CEST)" was missed by 0:00:04.362880 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:45:21 CEST)" was missed by 0:00:04.633571 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:45:21 CEST)" was missed by 0:00:04.444908 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:45:21 CEST)" was missed by 0:00:04.612584 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:45:21 CEST)" was missed by 0:00:04.581349 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:45:21 CEST)" was missed by 0:00:04.443983 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:45:21 CEST)" was missed by 0:00:04.330651 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:45:21 CEST)" was missed by 0:00:04.355987 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:45:21 CEST)" was missed by 0:00:04.484380 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:45:21 CEST)" was missed by 0:00:04.547722 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:45:21 CEST)" was missed by 0:00:04.530743 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:45:21 CEST)" was missed by 0:00:04.344430 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:45:21 CEST)" was missed by 0:00:04.475423 - iteration 7030/ 159576 | consumed samples: 263792 | elapsed time per iteration (ms): 30894.9 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.334834E+00 | loss scale: 2048.0 | grad norm: 78559.793 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:46:21 CEST)" was missed by 0:00:05.962230 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:46:21 CEST)" was missed by 0:00:06.032375 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:46:21 CEST)" was missed by 0:00:06.120215 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:46:21 CEST)" was missed by 0:00:05.656909 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:46:21 CEST)" was missed by 0:00:06.162477 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:46:21 CEST)" was missed by 0:00:06.104996 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:46:21 CEST)" was missed by 0:00:06.163335 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:46:21 CEST)" was missed by 0:00:05.963569 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:46:21 CEST)" was missed by 0:00:05.999526 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:46:21 CEST)" was missed by 0:00:05.869210 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:46:21 CEST)" was missed by 0:00:05.941772 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:46:21 CEST)" was missed by 0:00:06.117432 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:46:21 CEST)" was missed by 0:00:05.912776 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:46:21 CEST)" was missed by 0:00:06.059087 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:46:21 CEST)" was missed by 0:00:06.130661 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:46:21 CEST)" was missed by 0:00:06.067471 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:46:21 CEST)" was missed by 0:00:06.010258 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:46:21 CEST)" was missed by 0:00:05.909253 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:46:21 CEST)" was missed by 0:00:06.092624 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:46:21 CEST)" was missed by 0:00:05.842220 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:46:21 CEST)" was missed by 0:00:05.770962 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:46:21 CEST)" was missed by 0:00:05.806467 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:46:21 CEST)" was missed by 0:00:05.880672 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:46:21 CEST)" was missed by 0:00:05.849033 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:46:21 CEST)" was missed by 0:00:06.119771 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:46:21 CEST)" was missed by 0:00:05.970513 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:46:21 CEST)" was missed by 0:00:05.931077 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:46:21 CEST)" was missed by 0:00:06.068842 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:46:21 CEST)" was missed by 0:00:05.802982 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:46:21 CEST)" was missed by 0:00:05.850233 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:46:21 CEST)" was missed by 0:00:05.809479 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:46:21 CEST)" was missed by 0:00:05.930129 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:46:21 CEST)" was missed by 0:00:06.117898 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:46:21 CEST)" was missed by 0:00:05.816838 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:46:21 CEST)" was missed by 0:00:06.033833 
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:46:21 CEST)" was missed by 0:00:06.016836 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:46:21 CEST)" was missed by 0:00:05.842179 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:46:21 CEST)" was missed by 0:00:06.098810 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:46:21 CEST)" was missed by 0:00:05.830534 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:46:21 CEST)" was missed by 0:00:05.961528 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:47:21 CEST)" was missed by 0:00:07.566922 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:47:21 CEST)" was missed by 0:00:07.601944 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:47:21 CEST)" was missed by 0:00:07.654759 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:47:21 CEST)" was missed by 0:00:07.191431 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:47:21 CEST)" was missed by 0:00:07.376735 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:47:21 CEST)" was missed by 0:00:07.383570 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:47:21 CEST)" was missed by 0:00:07.697047 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:47:21 CEST)" was missed by 0:00:07.639513 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:47:21 CEST)" was missed by 0:00:07.697858 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:47:21 CEST)" was missed by 0:00:07.498072 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:47:21 CEST)" was missed by 0:00:07.496805 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:47:21 CEST)" was missed by 0:00:07.534058 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:47:21 CEST)" was missed by 0:00:07.403721 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 
2021-09-30 10:47:21 CEST)" was missed by 0:00:07.476321 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:47:21 CEST)" was missed by 0:00:07.651956 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:47:21 CEST)" was missed by 0:00:07.447321 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:47:21 CEST)" was missed by 0:00:07.593626 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:47:21 CEST)" was missed by 0:00:07.665167 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:47:21 CEST)" was missed by 0:00:07.544782 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:47:21 CEST)" was missed by 0:00:07.603355 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:47:21 CEST)" was missed by 0:00:07.337496 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:47:21 CEST)" was missed by 0:00:07.384746 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:47:21 CEST)" was missed by 0:00:07.343978 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:47:21 CEST)" was missed by 0:00:07.443778 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:47:21 CEST)" was missed by 0:00:07.464651 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:47:21 CEST)" was missed by 0:00:07.627156 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:47:21 CEST)" was missed by 0:00:07.652415 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:47:21 CEST)" was missed by 0:00:07.305500 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:47:21 CEST)" was missed by 0:00:07.341000 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:47:21 CEST)" was missed by 0:00:07.415231 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:47:21 CEST)" was missed by 0:00:07.654306 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:47:21 CEST)" was missed by 0:00:07.505076 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:47:21 CEST)" was missed by 0:00:07.465571 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:47:21 CEST)" was missed by 0:00:07.568362 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:47:21 CEST)" was missed by 0:00:07.633308 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:47:21 CEST)" was missed by 0:00:07.551341 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:47:21 CEST)" was missed by 0:00:07.351373 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:47:21 CEST)" was missed by 0:00:07.376682 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:47:21 CEST)" was missed by 0:00:07.365064 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:47:21 CEST)" was missed by 0:00:07.496032 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:48:21 CEST)" was missed by 0:00:08.622410 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:48:21 CEST)" was missed by 0:00:08.752512 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:48:21 CEST)" was missed by 0:00:08.695032 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:48:21 CEST)" was missed by 0:00:08.753382 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:48:21 CEST)" was missed by 0:00:08.553622 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:48:21 CEST)" was missed by 0:00:08.552290 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:48:21 CEST)" was missed by 0:00:08.459253 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:48:21 CEST)" was missed by 0:00:08.531823 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:48:21 CEST)" was missed by 0:00:08.707448 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:48:21 CEST)" was missed by 0:00:08.502820 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:48:21 CEST)" was missed by 0:00:08.649111 
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:48:21 CEST)" was missed by 0:00:08.720643 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:48:21 CEST)" was missed by 0:00:08.600300 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:48:21 CEST)" was missed by 0:00:08.392994 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:48:21 CEST)" was missed by 0:00:08.682662 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:48:21 CEST)" was missed by 0:00:08.710297 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:48:21 CEST)" was missed by 0:00:08.246983 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:48:21 CEST)" was missed by 0:00:08.432272 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:48:21 CEST)" was missed by 0:00:08.361003 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:48:21 CEST)" was missed by 0:00:08.470734 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:48:21 CEST)" was missed by 0:00:08.439094 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:48:21 CEST)" was missed by 0:00:08.589611 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:48:21 CEST)" was missed by 0:00:08.521113 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:48:21 CEST)" was missed by 0:00:08.657533 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:48:21 CEST)" was missed by 0:00:08.658862 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:48:21 CEST)" was missed by 0:00:08.440241 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:48:21 CEST)" was missed by 0:00:08.399485 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:48:21 CEST)" was missed by 0:00:08.499307 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:48:21 CEST)" was missed by 0:00:08.520173 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 
2021-09-30 10:48:21 CEST)" was missed by 0:00:08.707951 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:48:21 CEST)" was missed by 0:00:08.396521 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:48:21 CEST)" was missed by 0:00:08.432169 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:48:21 CEST)" was missed by 0:00:08.560564 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:48:21 CEST)" was missed by 0:00:08.688810 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:48:21 CEST)" was missed by 0:00:08.406895 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:48:21 CEST)" was missed by 0:00:08.606919 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:48:21 CEST)" was missed by 0:00:08.709877 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:48:21 CEST)" was missed by 0:00:08.420576 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:48:21 CEST)" was missed by 0:00:08.623926 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:48:21 CEST)" was missed by 0:00:08.551593 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:49:21 CEST)" was missed by 0:00:10.584246 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:49:21 CEST)" was missed by 0:00:10.642592 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:49:21 CEST)" was missed by 0:00:10.441508 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:49:21 CEST)" was missed by 0:00:10.511656 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:49:21 CEST)" was missed by 0:00:10.609907 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:49:21 CEST)" was missed by 0:00:10.546700 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:49:21 CEST)" was missed by 0:00:10.571872 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:49:21 CEST)" was missed by 0:00:10.136169 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:49:21 CEST)" was missed by 0:00:10.321488 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:49:21 CEST)" was missed by 0:00:10.250225 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:49:21 CEST)" was missed by 0:00:10.641762 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:49:21 CEST)" was missed by 0:00:10.442846 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:49:21 CEST)" was missed by 0:00:10.348474 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:49:21 CEST)" was missed by 0:00:10.410311 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:49:21 CEST)" was missed by 0:00:10.596728 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:49:21 CEST)" was missed by 0:00:10.392091 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:49:21 CEST)" was missed by 0:00:10.538386 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:49:21 CEST)" was missed by 0:00:10.489543 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:49:21 CEST)" was missed by 0:00:10.548110 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:49:21 CEST)" was missed by 0:00:10.282244 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:49:21 CEST)" was missed by 0:00:10.329491 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:49:21 CEST)" was missed by 0:00:10.288748 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:49:21 CEST)" was missed by 0:00:10.388556 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:49:21 CEST)" was missed by 0:00:10.409426 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:49:21 CEST)" was missed by 0:00:10.597159 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:49:21 CEST)" was missed by 0:00:10.599594 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:49:21 CEST)" was missed by 0:00:10.285752 
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:49:21 CEST)" was missed by 0:00:10.309727 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:49:21 CEST)" was missed by 0:00:10.359975 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:49:21 CEST)" was missed by 0:00:10.328387 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:49:21 CEST)" was missed by 0:00:10.478876 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:49:21 CEST)" was missed by 0:00:10.421139 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:49:21 CEST)" was missed by 0:00:10.513103 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:49:21 CEST)" was missed by 0:00:10.578056 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:49:21 CEST)" was missed by 0:00:10.496095 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:49:21 CEST)" was missed by 0:00:10.296122 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:49:21 CEST)" was missed by 0:00:10.321416 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:49:21 CEST)" was missed by 0:00:10.599083 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:49:21 CEST)" was missed by 0:00:10.449884 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:49:21 CEST)" was missed by 0:00:10.440759 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:50:21 CEST)" was missed by 0:00:11.635453 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:50:21 CEST)" was missed by 0:00:11.435242 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:50:21 CEST)" was missed by 0:00:11.590387 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:50:21 CEST)" was missed by 0:00:11.505390 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:50:21 CEST)" was missed by 0:00:11.540417 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 
2021-09-30 10:50:21 CEST)" was missed by 0:00:11.129919 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:50:21 CEST)" was missed by 0:00:11.315187 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:50:21 CEST)" was missed by 0:00:11.577988 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:50:21 CEST)" was missed by 0:00:11.636325 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:50:21 CEST)" was missed by 0:00:11.436564 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:50:21 CEST)" was missed by 0:00:11.342198 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:50:21 CEST)" was missed by 0:00:11.414843 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:50:21 CEST)" was missed by 0:00:11.385781 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:50:21 CEST)" was missed by 0:00:11.532090 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:50:21 CEST)" was missed by 0:00:11.603633 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:50:21 CEST)" was missed by 0:00:11.483251 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:50:21 CEST)" was missed by 0:00:11.541827 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:50:21 CEST)" was missed by 0:00:11.323197 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:50:21 CEST)" was missed by 0:00:11.282448 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:50:21 CEST)" was missed by 0:00:11.403135 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:50:21 CEST)" was missed by 0:00:11.565612 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:50:21 CEST)" was missed by 0:00:11.590905 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:50:21 CEST)" was missed by 0:00:11.593323 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:50:21 CEST)" was missed by 0:00:11.243949 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:50:21 CEST)" was missed by 0:00:11.353678 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:50:21 CEST)" was missed by 0:00:11.404080 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:50:21 CEST)" was missed by 0:00:11.506803 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:50:21 CEST)" was missed by 0:00:11.571762 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:50:21 CEST)" was missed by 0:00:11.489778 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:50:21 CEST)" was missed by 0:00:11.275989 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:50:21 CEST)" was missed by 0:00:11.382284 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:50:21 CEST)" was missed by 0:00:11.289846 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:50:21 CEST)" was missed by 0:00:11.279505 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:50:21 CEST)" was missed by 0:00:11.303440 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:50:21 CEST)" was missed by 0:00:11.315142 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:50:21 CEST)" was missed by 0:00:11.592809 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:50:21 CEST)" was missed by 0:00:11.472625 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:50:21 CEST)" was missed by 0:00:11.322164 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:50:21 CEST)" was missed by 0:00:11.443604 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:50:21 CEST)" was missed by 0:00:11.434481 - iteration 7040/ 159576 | consumed samples: 264752 | elapsed time per iteration (ms): 30641.5 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.358383E+00 | loss scale: 2048.0 | grad norm: 92601.974 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:51:21 CEST)" was missed by 0:00:10.591452 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:51:21 CEST)" was missed by 0:00:10.286107 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:51:21 CEST)" was missed by 0:00:10.471417 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:51:21 CEST)" was missed by 0:00:10.791725 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:51:21 CEST)" was missed by 0:00:10.734208 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:51:21 CEST)" was missed by 0:00:10.792537 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:51:21 CEST)" was missed by 0:00:10.592783 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:51:21 CEST)" was missed by 0:00:10.498428 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:51:21 CEST)" was missed by 0:00:10.746639 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:51:21 CEST)" was missed by 0:00:10.661618 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:51:21 CEST)" was missed by 0:00:10.688301 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:51:21 CEST)" was missed by 0:00:10.759857 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:51:21 CEST)" was missed by 0:00:10.696654 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:51:21 CEST)" was missed by 0:00:10.639479 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:51:21 CEST)" was missed by 0:00:10.721830 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:51:21 CEST)" was missed by 0:00:10.400186 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:51:21 CEST)" was missed by 0:00:10.509905 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:51:21 CEST)" was missed by 0:00:10.571073 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:51:21 CEST)" was missed by 0:00:10.560273 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:51:21 CEST)" was missed by 0:00:10.542020 
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:51:21 CEST)" was missed by 0:00:10.432207 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:51:21 CEST)" was missed by 0:00:10.438676 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:51:21 CEST)" was missed by 0:00:10.538503 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:51:21 CEST)" was missed by 0:00:10.559368 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:51:21 CEST)" was missed by 0:00:10.747123 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:51:21 CEST)" was missed by 0:00:10.749530 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:51:21 CEST)" was missed by 0:00:10.435694 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:51:21 CEST)" was missed by 0:00:10.628821 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:51:21 CEST)" was missed by 0:00:10.663044 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:51:21 CEST)" was missed by 0:00:10.646033 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:51:21 CEST)" was missed by 0:00:10.698094 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:51:21 CEST)" was missed by 0:00:10.479492 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:51:21 CEST)" was missed by 0:00:10.446064 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:51:21 CEST)" was missed by 0:00:10.459685 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:51:21 CEST)" was missed by 0:00:10.478365 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:51:21 CEST)" was missed by 0:00:10.749018 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:51:21 CEST)" was missed by 0:00:10.728047 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:51:21 CEST)" was missed by 0:00:10.471434 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 
2021-09-30 10:51:21 CEST)" was missed by 0:00:10.599834 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:51:21 CEST)" was missed by 0:00:10.590707 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:52:21 CEST)" was missed by 0:00:10.931373 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:52:21 CEST)" was missed by 0:00:10.966401 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:52:21 CEST)" was missed by 0:00:11.062320 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:52:21 CEST)" was missed by 0:00:10.861249 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:52:21 CEST)" was missed by 0:00:10.768197 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:52:21 CEST)" was missed by 0:00:11.016401 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:52:21 CEST)" was missed by 0:00:10.958068 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:52:21 CEST)" was missed by 0:00:11.029637 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:52:21 CEST)" was missed by 0:00:10.701968 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:52:21 CEST)" was missed by 0:00:10.555917 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:52:21 CEST)" was missed by 0:00:10.741204 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:52:21 CEST)" was missed by 0:00:11.061508 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:52:21 CEST)" was missed by 0:00:11.004012 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:52:21 CEST)" was missed by 0:00:10.862576 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:52:21 CEST)" was missed by 0:00:10.915753 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:52:21 CEST)" was missed by 0:00:10.811812 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:52:21 CEST)" was missed by 0:00:10.909266 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:52:21 CEST)" was missed by 0:00:10.708443 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:52:21 CEST)" was missed by 0:00:10.829125 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:52:21 CEST)" was missed by 0:00:10.991645 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:52:21 CEST)" was missed by 0:00:10.669976 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:52:21 CEST)" was missed by 0:00:10.779691 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:52:21 CEST)" was missed by 0:00:10.840868 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:52:21 CEST)" was missed by 0:00:10.830062 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:52:21 CEST)" was missed by 0:00:10.932789 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:52:21 CEST)" was missed by 0:00:10.967850 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:52:21 CEST)" was missed by 0:00:10.749236 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:52:21 CEST)" was missed by 0:00:10.808279 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:52:21 CEST)" was missed by 0:00:11.016924 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:52:21 CEST)" was missed by 0:00:11.019338 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:52:21 CEST)" was missed by 0:00:10.705481 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:52:21 CEST)" was missed by 0:00:10.729434 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:52:21 CEST)" was missed by 0:00:10.997801 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:52:21 CEST)" was missed by 0:00:10.860440 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:52:21 CEST)" was missed by 0:00:10.715857 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:52:21 CEST)" was missed by 0:00:10.741193 
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:52:21 CEST)" was missed by 0:00:10.748161 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:52:21 CEST)" was missed by 0:00:11.018824 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:52:21 CEST)" was missed by 0:00:10.898652 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:52:21 CEST)" was missed by 0:00:10.869621 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:53:21 CEST)" was missed by 0:00:12.358869 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:53:21 CEST)" was missed by 0:00:12.429013 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:53:21 CEST)" was missed by 0:00:12.053511 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:53:21 CEST)" was missed by 0:00:12.238840 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:53:21 CEST)" was missed by 0:00:12.559142 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:53:21 CEST)" was missed by 0:00:12.501602 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:53:21 CEST)" was missed by 0:00:12.559980 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:53:21 CEST)" was missed by 0:00:12.360201 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:53:21 CEST)" was missed by 0:00:12.265859 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:53:21 CEST)" was missed by 0:00:12.514058 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:53:21 CEST)" was missed by 0:00:12.455723 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:53:21 CEST)" was missed by 0:00:12.527308 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:53:21 CEST)" was missed by 0:00:12.464078 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:53:21 CEST)" was missed by 0:00:12.199613 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 
2021-09-30 10:53:21 CEST)" was missed by 0:00:12.489209 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:53:21 CEST)" was missed by 0:00:12.167570 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:53:21 CEST)" was missed by 0:00:12.338478 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:53:21 CEST)" was missed by 0:00:12.413430 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:53:21 CEST)" was missed by 0:00:12.309436 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:53:21 CEST)" was missed by 0:00:12.406904 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:53:21 CEST)" was missed by 0:00:12.465497 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:53:21 CEST)" was missed by 0:00:12.246884 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:53:21 CEST)" was missed by 0:00:12.206082 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:53:21 CEST)" was missed by 0:00:12.305921 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:53:21 CEST)" was missed by 0:00:12.326783 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:53:21 CEST)" was missed by 0:00:12.514507 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:53:21 CEST)" was missed by 0:00:12.213457 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:53:21 CEST)" was missed by 0:00:12.203103 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:53:21 CEST)" was missed by 0:00:12.277348 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:53:21 CEST)" was missed by 0:00:12.245754 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:53:21 CEST)" was missed by 0:00:12.516411 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:53:21 CEST)" was missed by 0:00:12.396238 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:53:21 CEST)" was missed by 0:00:12.327684 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:53:21 CEST)" was missed by 0:00:12.430468 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:53:21 CEST)" was missed by 0:00:12.495431 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:53:21 CEST)" was missed by 0:00:12.516999 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:53:21 CEST)" was missed by 0:00:12.227118 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:53:21 CEST)" was missed by 0:00:12.238844 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:53:21 CEST)" was missed by 0:00:12.367231 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 10:53:21 CEST)" was missed by 0:00:12.358126 - iteration 7050/ 159576 | consumed samples: 265712 | elapsed time per iteration (ms): 30276.4 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.377520E+00 | loss scale: 2048.0 | grad norm: 87491.125 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7060/ 159576 | consumed samples: 266672 | elapsed time per iteration (ms): 30001.3 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.403289E+00 | loss scale: 2048.0 | grad norm: 68469.471 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7070/ 159576 | consumed samples: 267632 | elapsed time per iteration (ms): 30071.8 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.391364E+00 | loss scale: 2048.0 | grad norm: 55736.037 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7080/ 159576 | consumed samples: 268592 | elapsed time per iteration (ms): 30456.8 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.409267E+00 | loss scale: 2048.0 | grad norm: 55717.380 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7090/ 159576 | consumed samples: 269552 | elapsed time per iteration (ms): 30600.9 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.388932E+00 | loss scale: 2048.0 | grad norm: 68003.016 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7100/ 159576 | consumed samples: 270512 | elapsed time per iteration (ms): 30433.9 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.426712E+00 | loss scale: 2048.0 | grad norm: 72453.474 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:22:21 CEST)" was missed by 0:00:03.318933 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:22:21 CEST)" was missed by 0:00:03.298487 -WARNING:apscheduler.executors.default:Run time of 
job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:22:21 CEST)" was missed by 0:00:03.389117 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:22:21 CEST)" was missed by 0:00:03.424087 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:22:21 CEST)" was missed by 0:00:03.013617 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:22:21 CEST)" was missed by 0:00:03.425489 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:22:21 CEST)" was missed by 0:00:03.206878 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:22:21 CEST)" was missed by 0:00:03.461692 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:22:21 CEST)" was missed by 0:00:03.474125 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:22:21 CEST)" was missed by 0:00:03.159644 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:22:21 CEST)" was missed by 0:00:03.520094 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:22:21 CEST)" was missed by 0:00:03.320294 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:22:21 CEST)" was missed by 0:00:03.415808 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:22:21 CEST)" was missed by 0:00:03.487383 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:22:21 CEST)" was missed by 0:00:03.366977 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:22:21 CEST)" was missed by 0:00:03.474585 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:22:21 CEST)" was missed by 0:00:03.127712 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:22:21 CEST)" was missed by 0:00:03.163148 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:22:21 CEST)" was missed by 0:00:03.237418 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:22:21 CEST)" was missed by 0:00:03.519234 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:22:21 CEST)" was missed by 
0:00:03.225947 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:22:21 CEST)" was missed by 0:00:03.390453 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:22:21 CEST)" was missed by 0:00:03.269513 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:22:21 CEST)" was missed by 0:00:03.166209 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:22:21 CEST)" was missed by 0:00:03.265975 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:22:21 CEST)" was missed by 0:00:03.449356 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:22:21 CEST)" was missed by 0:00:03.477025 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:22:21 CEST)" was missed by 0:00:03.173549 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:22:21 CEST)" was missed by 0:00:03.198825 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:22:21 CEST)" was missed by 0:00:03.476453 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:22:21 CEST)" was missed by 0:00:03.356308 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:22:21 CEST)" was missed by 0:00:03.287795 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:22:21 CEST)" was missed by 0:00:03.455463 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:22:21 CEST)" was missed by 0:00:03.373514 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:22:21 CEST)" was missed by 0:00:03.286858 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:22:21 CEST)" was missed by 0:00:03.199010 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:22:21 CEST)" was missed by 0:00:03.187137 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:22:21 CEST)" was missed by 0:00:03.205835 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:22:21 CEST)" was missed by 0:00:03.327285 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: 
interval[0:01:00], next run at: 2021-09-30 11:22:21 CEST)" was missed by 0:00:03.318144 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:23:21 CEST)" was missed by 0:00:04.624314 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:23:21 CEST)" was missed by 0:00:04.767076 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:23:21 CEST)" was missed by 0:00:04.531316 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:23:21 CEST)" was missed by 0:00:04.603913 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:23:21 CEST)" was missed by 0:00:04.779513 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:23:21 CEST)" was missed by 0:00:04.694514 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:23:21 CEST)" was missed by 0:00:04.730889 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:23:21 CEST)" was missed by 0:00:04.465043 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:23:21 CEST)" was missed by 0:00:04.433066 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:23:21 CEST)" was missed by 0:00:04.824607 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:23:21 CEST)" was missed by 0:00:04.825460 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:23:21 CEST)" was missed by 0:00:04.574882 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:23:21 CEST)" was missed by 0:00:04.729550 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:23:21 CEST)" was missed by 0:00:04.672362 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:23:21 CEST)" was missed by 0:00:04.512294 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:23:21 CEST)" was missed by 0:00:04.571331 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:23:21 CEST)" was missed by 0:00:04.754710 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:23:21 CEST)" was missed by 0:00:04.319059 -WARNING:apscheduler.executors.default:Run 
time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:23:21 CEST)" was missed by 0:00:04.625720 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:23:21 CEST)" was missed by 0:00:04.593161 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:23:21 CEST)" was missed by 0:00:04.721210 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:23:21 CEST)" was missed by 0:00:04.792805 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:23:21 CEST)" was missed by 0:00:04.779993 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:23:21 CEST)" was missed by 0:00:04.782416 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:23:21 CEST)" was missed by 0:00:04.468572 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:23:21 CEST)" was missed by 0:00:04.542805 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:23:21 CEST)" was missed by 0:00:04.661691 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:23:21 CEST)" was missed by 0:00:04.760869 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:23:21 CEST)" was missed by 0:00:04.471607 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:23:21 CEST)" was missed by 0:00:04.592258 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:23:21 CEST)" was missed by 0:00:04.504394 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:23:21 CEST)" was missed by 0:00:04.478964 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:23:21 CEST)" was missed by 0:00:04.504235 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:23:21 CEST)" was missed by 0:00:04.695913 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:23:21 CEST)" was missed by 0:00:04.492573 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:23:21 CEST)" was missed by 0:00:04.511245 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:23:21 CEST)" was missed by 
0:00:04.781897 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:23:21 CEST)" was missed by 0:00:04.678963 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:23:21 CEST)" was missed by 0:00:04.632687 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:23:21 CEST)" was missed by 0:00:04.623599 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:24:21 CEST)" was missed by 0:00:07.361410 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:24:21 CEST)" was missed by 0:00:07.340994 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:24:21 CEST)" was missed by 0:00:07.516568 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:24:21 CEST)" was missed by 0:00:07.431595 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:24:21 CEST)" was missed by 0:00:07.056106 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:24:21 CEST)" was missed by 0:00:07.561669 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:24:21 CEST)" was missed by 0:00:07.504170 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:24:21 CEST)" was missed by 0:00:07.562532 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:24:21 CEST)" was missed by 0:00:07.268414 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:24:21 CEST)" was missed by 0:00:07.311944 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:24:21 CEST)" was missed by 0:00:07.529853 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:24:21 CEST)" was missed by 0:00:07.409410 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:24:21 CEST)" was missed by 0:00:07.467974 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:24:21 CEST)" was missed by 0:00:07.202125 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:24:21 CEST)" was missed by 0:00:07.308412 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: 
interval[0:01:00], next run at: 2021-09-30 11:24:21 CEST)" was missed by 0:00:07.491807 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:24:21 CEST)" was missed by 0:00:07.170158 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:24:21 CEST)" was missed by 0:00:07.279860 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:24:21 CEST)" was missed by 0:00:07.362786 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:24:21 CEST)" was missed by 0:00:07.458298 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:24:21 CEST)" was missed by 0:00:07.249383 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:24:21 CEST)" was missed by 0:00:07.517065 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:24:21 CEST)" was missed by 0:00:07.519490 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:24:21 CEST)" was missed by 0:00:07.216002 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:24:21 CEST)" was missed by 0:00:07.205649 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:24:21 CEST)" was missed by 0:00:07.398786 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:24:21 CEST)" was missed by 0:00:07.330273 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:24:21 CEST)" was missed by 0:00:07.466709 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:24:21 CEST)" was missed by 0:00:07.208696 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:24:21 CEST)" was missed by 0:00:07.329359 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:24:21 CEST)" was missed by 0:00:07.241307 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:24:21 CEST)" was missed by 0:00:07.518939 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:24:21 CEST)" was missed by 0:00:07.433013 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:24:21 CEST)" was missed by 0:00:07.497970 -WARNING:apscheduler.executors.default:Run 
time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:24:21 CEST)" was missed by 0:00:07.241499 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:24:21 CEST)" was missed by 0:00:07.248338 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:24:21 CEST)" was missed by 0:00:07.369764 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:24:21 CEST)" was missed by 0:00:07.229716 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:24:21 CEST)" was missed by 0:00:07.416104 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:24:21 CEST)" was missed by 0:00:07.360764 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:25:21 CEST)" was missed by 0:00:10.237357 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:25:21 CEST)" was missed by 0:00:10.216945 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:25:21 CEST)" was missed by 0:00:10.125296 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:25:21 CEST)" was missed by 0:00:10.343912 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:25:21 CEST)" was missed by 0:00:09.932054 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:25:21 CEST)" was missed by 0:00:10.437625 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:25:21 CEST)" was missed by 0:00:10.380133 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:25:21 CEST)" was missed by 0:00:10.392560 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:25:21 CEST)" was missed by 0:00:10.307580 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:25:21 CEST)" was missed by 0:00:10.342613 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:25:21 CEST)" was missed by 0:00:10.078077 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:25:21 CEST)" was missed by 0:00:10.184376 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:25:21 CEST)" was missed by 
0:00:10.393002 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:25:21 CEST)" was missed by 0:00:10.046113 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:25:21 CEST)" was missed by 0:00:10.438516 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:25:21 CEST)" was missed by 0:00:10.187922 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:25:21 CEST)" was missed by 0:00:10.405870 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:25:21 CEST)" was missed by 0:00:10.285412 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:25:21 CEST)" was missed by 0:00:10.091962 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:25:21 CEST)" was missed by 0:00:10.105563 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:25:21 CEST)" was missed by 0:00:10.155851 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:25:21 CEST)" was missed by 0:00:10.117250 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:25:21 CEST)" was missed by 0:00:10.274739 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:25:21 CEST)" was missed by 0:00:10.144419 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:25:21 CEST)" was missed by 0:00:10.373896 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:25:21 CEST)" was missed by 0:00:10.334285 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:25:21 CEST)" was missed by 0:00:10.367775 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:25:21 CEST)" was missed by 0:00:10.395474 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:25:21 CEST)" was missed by 0:00:10.081602 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:25:21 CEST)" was missed by 0:00:10.394888 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:25:21 CEST)" was missed by 0:00:10.238789 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: 
- iteration     7110/  159576 | consumed samples: 271472 | elapsed time per iteration (ms): 31101.7 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 6.580699E+00 | loss scale: 2048.0 | grad norm: 157912.632 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
[... dozens of near-identical WARNING:apscheduler.executors.default lines elided: run scheduled at 11:26:21 CEST missed by ~11.5-12 s ...]
- iteration     7120/  159576 | consumed samples: 272432 | elapsed time per iteration (ms): 28698.3 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 9.185058E+00 | loss scale: 2048.0 | grad norm: 180998.054 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:34:21 CEST)" was missed by 0:00:03.078536 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:34:21 CEST)" was missed by 0:00:03.172240 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:34:21 CEST)" was missed by 0:00:03.114748 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:34:21 CEST)" was missed by 0:00:03.068860 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:34:21 CEST)" was missed by 0:00:03.140437 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:34:21 CEST)" was missed by 0:00:03.019991 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:34:21 CEST)" was missed by 0:00:03.102358 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:34:21 CEST)" was missed by 0:00:03.127637 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:34:21 CEST)" was missed by 0:00:03.130061 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:34:21 CEST)" was missed by 0:00:03.129475 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:34:21 CEST)" was missed by 0:00:03.043512 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:34:21 CEST)" was missed by 0:00:03.108489 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:34:21 CEST)" was missed by 0:00:03.009346 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:34:21 CEST)" was missed by 0:00:03.026581 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:35:21 CEST)" was missed by 0:00:04.259543 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:35:21 CEST)" was missed by 0:00:04.295915 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:35:21 CEST)" was missed by 0:00:04.389641 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:35:21 CEST)" was missed by 0:00:04.332119 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:35:21 CEST)" was missed by 0:00:04.390509 
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:35:21 CEST)" was missed by 0:00:04.344594 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:35:21 CEST)" was missed by 0:00:04.286243 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:35:21 CEST)" was missed by 0:00:04.357846 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:35:21 CEST)" was missed by 0:00:04.294609 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:35:21 CEST)" was missed by 0:00:04.319760 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:35:21 CEST)" was missed by 0:00:04.347402 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:35:21 CEST)" was missed by 0:00:04.346874 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:35:21 CEST)" was missed by 0:00:04.237432 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:35:21 CEST)" was missed by 0:00:04.345033 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:35:21 CEST)" was missed by 0:00:04.226731 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:35:21 CEST)" was missed by 0:00:04.325897 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:35:21 CEST)" was missed by 0:00:04.189505 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:35:21 CEST)" was missed by 0:00:04.169075 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:35:21 CEST)" was missed by 0:00:04.260995 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:35:21 CEST)" was missed by 0:00:04.244003 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:35:21 CEST)" was missed by 0:00:03.884165 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:35:21 CEST)" was missed by 0:00:03.998219 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:35:21 CEST)" was missed by 0:00:04.030182 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 
2021-09-30 11:35:21 CEST)" was missed by 0:00:04.096478 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:35:21 CEST)" was missed by 0:00:04.036710 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:35:21 CEST)" was missed by 0:00:04.140051 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:35:21 CEST)" was missed by 0:00:04.044072 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:35:21 CEST)" was missed by 0:00:04.069330 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:35:21 CEST)" was missed by 0:00:04.033719 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:35:21 CEST)" was missed by 0:00:04.190868 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:35:21 CEST)" was missed by 0:00:04.077488 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:35:21 CEST)" was missed by 0:00:04.158336 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:35:21 CEST)" was missed by 0:00:04.136540 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:35:21 CEST)" was missed by 0:00:04.107974 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:35:21 CEST)" was missed by 0:00:04.157442 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:35:21 CEST)" was missed by 0:00:04.069590 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:35:21 CEST)" was missed by 0:00:04.076420 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:35:21 CEST)" was missed by 0:00:04.197836 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:35:21 CEST)" was missed by 0:00:04.057750 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:35:21 CEST)" was missed by 0:00:04.188800 - iteration 7130/ 159576 | consumed samples: 273392 | elapsed time per iteration (ms): 30693.7 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 1.049975E+01 | loss scale: 2048.0 | grad norm: 55461.720 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 
11:36:21 CEST)" was missed by 0:00:06.247847 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:36:21 CEST)" was missed by 0:00:06.077026 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:36:21 CEST)" was missed by 0:00:06.468561 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:36:21 CEST)" was missed by 0:00:06.411055 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:36:21 CEST)" was missed by 0:00:06.469363 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:36:21 CEST)" was missed by 0:00:06.269624 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:36:21 CEST)" was missed by 0:00:06.268322 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:36:21 CEST)" was missed by 0:00:06.175234 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:36:21 CEST)" was missed by 0:00:06.423503 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:36:21 CEST)" was missed by 0:00:06.338502 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:36:21 CEST)" was missed by 0:00:06.374853 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:36:21 CEST)" was missed by 0:00:06.108998 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:36:21 CEST)" was missed by 0:00:06.423914 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:36:21 CEST)" was missed by 0:00:05.962972 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:36:21 CEST)" was missed by 0:00:06.122857 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:36:21 CEST)" was missed by 0:00:06.425783 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:36:21 CEST)" was missed by 0:00:06.237102 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:36:21 CEST)" was missed by 0:00:06.218851 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:36:21 CEST)" was missed by 0:00:06.365151 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:36:21 CEST)" was missed by 0:00:06.436770 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:36:21 CEST)" was missed by 0:00:06.373546 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:36:21 CEST)" was missed by 0:00:06.316332 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:36:21 CEST)" was missed by 0:00:06.115542 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:36:21 CEST)" was missed by 0:00:06.215324 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:36:21 CEST)" was missed by 0:00:06.398681 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:36:21 CEST)" was missed by 0:00:06.426357 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:36:21 CEST)" was missed by 0:00:06.148346 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:36:21 CEST)" was missed by 0:00:06.112535 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:36:21 CEST)" was missed by 0:00:06.186758 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:36:21 CEST)" was missed by 0:00:06.148136 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:36:21 CEST)" was missed by 0:00:06.305659 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:36:21 CEST)" was missed by 0:00:06.339886 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:36:21 CEST)" was missed by 0:00:06.404814 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:36:21 CEST)" was missed by 0:00:06.156300 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:36:21 CEST)" was missed by 0:00:06.236245 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:36:21 CEST)" was missed by 0:00:06.155188 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:36:21 CEST)" was missed by 0:00:06.276628 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:36:21 CEST)" was missed by 0:00:06.322921 
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:36:21 CEST)" was missed by 0:00:06.136531 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:36:21 CEST)" was missed by 0:00:06.267579 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:37:21 CEST)" was missed by 0:00:08.928266 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:37:21 CEST)" was missed by 0:00:08.768954 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:37:21 CEST)" was missed by 0:00:08.622946 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:37:21 CEST)" was missed by 0:00:08.736974 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:37:21 CEST)" was missed by 0:00:09.128537 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:37:21 CEST)" was missed by 0:00:09.071010 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:37:21 CEST)" was missed by 0:00:09.129398 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:37:21 CEST)" was missed by 0:00:08.929653 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:37:21 CEST)" was missed by 0:00:08.835284 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:37:21 CEST)" was missed by 0:00:08.907868 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:37:21 CEST)" was missed by 0:00:09.083486 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:37:21 CEST)" was missed by 0:00:08.998493 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:37:21 CEST)" was missed by 0:00:09.034839 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:37:21 CEST)" was missed by 0:00:08.775507 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:37:21 CEST)" was missed by 0:00:09.058639 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:37:21 CEST)" was missed by 0:00:09.083883 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 
2021-09-30 11:37:21 CEST)" was missed by 0:00:09.086338 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:37:21 CEST)" was missed by 0:00:08.782846 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:37:21 CEST)" was missed by 0:00:08.772482 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:37:21 CEST)" was missed by 0:00:09.085763 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:37:21 CEST)" was missed by 0:00:08.897112 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:37:21 CEST)" was missed by 0:00:08.878840 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:37:21 CEST)" was missed by 0:00:09.025168 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:37:21 CEST)" was missed by 0:00:09.096761 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:37:21 CEST)" was missed by 0:00:09.033547 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:37:21 CEST)" was missed by 0:00:08.976333 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:37:21 CEST)" was missed by 0:00:08.816267 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:37:21 CEST)" was missed by 0:00:08.875334 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:37:21 CEST)" was missed by 0:00:08.808343 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:37:21 CEST)" was missed by 0:00:08.846786 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:37:21 CEST)" was missed by 0:00:08.808124 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:37:21 CEST)" was missed by 0:00:08.965655 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:37:21 CEST)" was missed by 0:00:09.064847 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:37:21 CEST)" was missed by 0:00:08.896237 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:37:21 CEST)" was missed by 0:00:08.815191 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:37:21 CEST)" was missed by 0:00:08.936617 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:37:21 CEST)" was missed by 0:00:08.796523 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:37:21 CEST)" was missed by 0:00:08.999916 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:37:21 CEST)" was missed by 0:00:08.982956 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:37:21 CEST)" was missed by 0:00:08.927591 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:38:21 CEST)" was missed by 0:00:09.355080 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:38:21 CEST)" was missed by 0:00:09.334666 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:38:21 CEST)" was missed by 0:00:09.460261 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:38:21 CEST)" was missed by 0:00:09.049746 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:38:21 CEST)" was missed by 0:00:09.163793 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:38:21 CEST)" was missed by 0:00:09.555338 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:38:21 CEST)" was missed by 0:00:09.497834 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:38:21 CEST)" was missed by 0:00:09.556152 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:38:21 CEST)" was missed by 0:00:09.356409 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:38:21 CEST)" was missed by 0:00:09.262048 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:38:21 CEST)" was missed by 0:00:09.323882 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:38:21 CEST)" was missed by 0:00:09.426572 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:38:21 CEST)" was missed by 0:00:09.510282 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:38:21 CEST)" was missed by 0:00:09.305640 
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:38:21 CEST)" was missed by 0:00:09.425298 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:38:21 CEST)" was missed by 0:00:09.451967 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:38:21 CEST)" was missed by 0:00:09.403114 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:38:21 CEST)" was missed by 0:00:09.461617 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:38:21 CEST)" was missed by 0:00:09.195759 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:38:21 CEST)" was missed by 0:00:09.485450 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:38:21 CEST)" was missed by 0:00:09.510679 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:38:21 CEST)" was missed by 0:00:09.209659 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:38:21 CEST)" was missed by 0:00:09.199297 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:38:21 CEST)" was missed by 0:00:09.273551 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:38:21 CEST)" was missed by 0:00:09.234918 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:38:21 CEST)" was missed by 0:00:09.512557 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:38:21 CEST)" was missed by 0:00:09.491581 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:38:21 CEST)" was missed by 0:00:09.523562 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:38:21 CEST)" was missed by 0:00:09.243074 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:38:21 CEST)" was missed by 0:00:09.202325 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:38:21 CEST)" was missed by 0:00:09.302112 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:38:21 CEST)" was missed by 0:00:09.513177 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 
2021-09-30 11:38:21 CEST)" was missed by 0:00:09.235100 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:38:21 CEST)" was missed by 0:00:09.241979 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:38:21 CEST)" was missed by 0:00:09.363420 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:38:21 CEST)" was missed by 0:00:09.392457 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:38:21 CEST)" was missed by 0:00:09.354326 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:38:21 CEST)" was missed by 0:00:09.409672 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:38:21 CEST)" was missed by 0:00:09.323043 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:38:21 CEST)" was missed by 0:00:09.223303 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:39:21 CEST)" was missed by 0:00:10.843838 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:39:21 CEST)" was missed by 0:00:10.916468 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:39:21 CEST)" was missed by 0:00:11.079706 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:39:21 CEST)" was missed by 0:00:11.138020 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:39:21 CEST)" was missed by 0:00:10.938270 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:39:21 CEST)" was missed by 0:00:10.936965 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:39:21 CEST)" was missed by 0:00:11.092125 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:39:21 CEST)" was missed by 0:00:11.007098 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:39:21 CEST)" was missed by 0:00:11.105380 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:39:21 CEST)" was missed by 0:00:11.042145 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:39:21 CEST)" was missed by 0:00:11.043467 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:39:21 CEST)" was missed by 0:00:10.777620 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:39:21 CEST)" was missed by 0:00:10.784109 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:39:21 CEST)" was missed by 0:00:10.883924 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:39:21 CEST)" was missed by 0:00:10.631603 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:39:21 CEST)" was missed by 0:00:10.745679 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:39:21 CEST)" was missed by 0:00:11.137222 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:39:21 CEST)" was missed by 0:00:11.073436 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:39:21 CEST)" was missed by 0:00:10.887501 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:39:21 CEST)" was missed by 0:00:11.033800 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:39:21 CEST)" was missed by 0:00:10.984971 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:39:21 CEST)" was missed by 0:00:10.904842 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:39:21 CEST)" was missed by 0:00:11.067336 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:39:21 CEST)" was missed by 0:00:11.092567 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:39:21 CEST)" was missed by 0:00:11.095006 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:39:21 CEST)" was missed by 0:00:10.816962 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:39:21 CEST)" was missed by 0:00:10.791507 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:39:21 CEST)" was missed by 0:00:10.781160 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:39:21 CEST)" was missed by 0:00:10.855406 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:39:21 CEST)" was missed by 0:00:10.816775 
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:39:21 CEST)" was missed by 0:00:11.094427 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:39:21 CEST)" was missed by 0:00:10.905752 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:39:21 CEST)" was missed by 0:00:11.008481 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:39:21 CEST)" was missed by 0:00:10.824926 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:39:21 CEST)" was missed by 0:00:10.805114 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:39:21 CEST)" was missed by 0:00:10.823840 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:39:21 CEST)" was missed by 0:00:10.945262 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:39:21 CEST)" was missed by 0:00:10.974318 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:39:21 CEST)" was missed by 0:00:10.991547 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:39:21 CEST)" was missed by 0:00:10.936203 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:40:21 CEST)" was missed by 0:00:12.931815 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:40:21 CEST)" was missed by 0:00:12.838752 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:40:21 CEST)" was missed by 0:00:12.911377 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:40:21 CEST)" was missed by 0:00:13.086962 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:40:21 CEST)" was missed by 0:00:13.036991 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:40:21 CEST)" was missed by 0:00:12.740538 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:40:21 CEST)" was missed by 0:00:13.132042 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:40:21 CEST)" was missed by 0:00:13.074578 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 
2021-09-30 11:40:21 CEST)" was missed by 0:00:13.132894 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:40:21 CEST)" was missed by 0:00:12.933158 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:40:21 CEST)" was missed by 0:00:12.882364 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:40:21 CEST)" was missed by 0:00:13.002021 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:40:21 CEST)" was missed by 0:00:12.979830 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:40:21 CEST)" was missed by 0:00:13.038352 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:40:21 CEST)" was missed by 0:00:12.772517 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:40:21 CEST)" was missed by 0:00:12.779036 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:40:21 CEST)" was missed by 0:00:12.878827 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:40:21 CEST)" was missed by 0:00:13.062174 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:40:21 CEST)" was missed by 0:00:12.626495 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:40:21 CEST)" was missed by 0:00:12.850272 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:40:21 CEST)" was missed by 0:00:13.089316 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:40:21 CEST)" was missed by 0:00:12.969173 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:40:21 CEST)" was missed by 0:00:12.900623 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:40:21 CEST)" was missed by 0:00:12.986388 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:40:21 CEST)" was missed by 0:00:13.028704 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:40:21 CEST)" was missed by 0:00:13.100300 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:40:21 CEST)" was missed by 0:00:12.819831 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:40:21 CEST)" was missed by 0:00:12.899749 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:40:21 CEST)" was missed by 0:00:13.087457 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:40:21 CEST)" was missed by 0:00:13.089904 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:40:21 CEST)" was missed by 0:00:12.811854 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:40:21 CEST)" was missed by 0:00:12.786431 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:40:21 CEST)" was missed by 0:00:12.776082 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:40:21 CEST)" was missed by 0:00:12.800014 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:40:21 CEST)" was missed by 0:00:12.811711 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:40:21 CEST)" was missed by 0:00:12.940149 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:40:21 CEST)" was missed by 0:00:13.003408 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:40:21 CEST)" was missed by 0:00:13.068345 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:40:21 CEST)" was missed by 0:00:12.931029 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 11:40:21 CEST)" was missed by 0:00:12.818738 - iteration 7140/ 159576 | consumed samples: 274352 | elapsed time per iteration (ms): 30916.8 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 9.133900E+00 | loss scale: 2048.0 | grad norm: 87682.088 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -[2021-09-30 11:43:48] PULSE: tr8-104B is running for 7:51:41 since 2021-09-30T03:52:07 (1289770 on 'gpu_p13' partition (r6i4n[5-6,8],r6i5n[4-5],r7i0n[5-8],r7i1n0,r8i2n8,r8i4n1,r8i7n[3-8],r9i0n[0-8],r9i1n[0-8],r9i2n[3-8],r9i3n[7-8],r9i4n[0-2],r9i5n[2,5-7],r9i6n[2-8],r14i7n[1-6]) - iteration 7150/ 159576 | consumed samples: 275312 | elapsed time per iteration (ms): 29962.5 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 8.327563E+00 | loss scale: 2048.0 | grad norm: 19811.678 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7160/ 159576 | consumed samples: 276272 | elapsed time per iteration (ms): 29964.9 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 7.936829E+00 | loss scale: 2048.0 | grad norm: 35030.288 | num zeros: 0.0 | number of 
skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7170/ 159576 | consumed samples: 277232 | elapsed time per iteration (ms): 30316.3 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 7.438504E+00 | loss scale: 2048.0 | grad norm: 19990.149 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7180/ 159576 | consumed samples: 278192 | elapsed time per iteration (ms): 29924.7 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 7.164587E+00 | loss scale: 2048.0 | grad norm: 15039.119 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7190/ 159576 | consumed samples: 279152 | elapsed time per iteration (ms): 30031.6 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 7.031236E+00 | loss scale: 2048.0 | grad norm: 32055.211 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7200/ 159576 | consumed samples: 280112 | elapsed time per iteration (ms): 29800.9 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 7.061478E+00 | loss scale: 2048.0 | grad norm: 18698.477 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -saving checkpoint at iteration 7200 to /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints -[2021-09-30 12:10:41,017] [INFO] [logging.py:68:log_dist] [Rank 0] Saving model checkpoint: /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/global_step7200/mp_rank_00_model_states.pt - successfully saved checkpoint at iteration 7200 to /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints -time (ms) | save-checkpoint: 18645.50 - iteration 7210/ 159576 | consumed samples: 281072 | elapsed time per iteration (ms): 31462.3 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 7.117988E+00 | loss scale: 2048.0 | grad norm: 19766.348 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7220/ 159576 | consumed samples: 282032 | elapsed time per iteration (ms): 29098.0 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 7.102581E+00 | loss scale: 2048.0 | grad norm: 43730.846 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7230/ 159576 | consumed samples: 282992 | elapsed time per iteration (ms): 29584.3 | learning rate: 6.000E-05 | global batch size: 96 | lm loss: 7.110014E+00 | loss scale: 2048.0 | grad norm: 69532.973 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7240/ 159576 | consumed samples: 284032 | elapsed time per iteration (ms): 30170.4 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 7.003487E+00 | loss scale: 2048.0 | grad norm: 15084.079 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7250/ 159576 | consumed samples: 285152 | elapsed time per iteration (ms): 30507.4 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.860502E+00 | loss scale: 2048.0 | grad norm: 15314.924 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7260/ 159576 | consumed samples: 286272 | elapsed time per iteration (ms): 30929.4 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.896826E+00 | loss scale: 2048.0 | grad norm: 24300.554 | num zeros: 0.0 | number of skipped 
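The checkpoint lines in the block above (the global_step7200/mp_rank_00_model_states.pt path and the 18.6 s save-checkpoint timing) match DeepSpeed's checkpoint layout, where the checkpoint tag directory defaults to "global_step{N}". A rough sketch of how such a checkpoint is written and restored through the DeepSpeed engine API; the toy model and config below are stand-ins, and a real run would be started with the deepspeed launcher rather than plain python:

import torch
import deepspeed

model = torch.nn.Linear(8, 8)  # toy stand-in for the actual 104B model
ds_config = {
    "train_batch_size": 1,
    "optimizer": {"type": "Adam", "params": {"lr": 6.0e-5}},
}

# deepspeed.initialize wraps the model into an engine that owns checkpointing
# (it expects a torch.distributed environment, e.g. via `deepspeed train.py`)
engine, _, _, _ = deepspeed.initialize(
    model=model, model_parameters=model.parameters(), config=ds_config
)

save_dir = "/gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints"

# with tag=None the tag defaults to f"global_step{engine.global_steps}",
# which is where the global_step7200/ directory name above comes from
engine.save_checkpoint(save_dir)

# resuming restores module, optimizer and scheduler state for this rank;
# rank 0's model states live in mp_rank_00_model_states.pt
load_path, client_state = engine.load_checkpoint(save_dir, tag="global_step7200")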
-[... repeated apscheduler "missed by" warnings elided (runs at 12:42:21 through 12:44:21 CEST; lag ~3-8 s) ...]
-[2021-09-30 12:44:18] PULSE: tr8-104B is running for 8:52:11 since 2021-09-30T03:52:07 (1289770 on 'gpu_p13' partition (r6i4n[5-6,8],r6i5n[4-5],r7i0n[5-8],r7i1n0,r8i2n8,r8i4n1,r8i7n[3-8],r9i0n[0-8],r9i1n[0-8],r9i2n[3-8],r9i3n[7-8],r9i4n[0-2],r9i5n[2,5-7],r9i6n[2-8],r14i7n[1-6])
-[... repeated apscheduler "missed by" warnings elided (run at 12:45:21 CEST; lag ~9-10 s) ...]
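The iteration records preserved in this log are regular enough to mine programmatically. The sketch below is a hypothetical helper, not part of the training code: it extracts iteration number, consumed samples, per-iteration time, batch size and lm loss from lines like the ones above, and derives throughput (one global batch per iteration), e.g. 112 samples every ~30.9 s at iteration 7260 is about 3.6 samples/s:

import re

# matches Megatron-style records such as:
#   iteration 7260/ 159576 | consumed samples: 286272 | elapsed time per
#   iteration (ms): 30929.4 | ... | global batch size: 112 | lm loss: 6.896826E+00 | ...
PATTERN = re.compile(
    r"iteration\s+(\d+)/\s*\d+\s*\|\s*consumed samples:\s*(\d+)\s*\|"
    r"\s*elapsed time per iteration \(ms\):\s*([\d.]+).*?"
    r"global batch size:\s*(\d+)\s*\|\s*lm loss:\s*([\dE+.-]+)"
)

def iter_records(lines):
    for line in lines:
        m = PATTERN.search(line)
        if m:
            it, samples, ms, gbs, loss = m.groups()
            yield {
                "iteration": int(it),
                "consumed_samples": int(samples),
                "ms_per_iter": float(ms),
                "global_batch_size": int(gbs),
                "lm_loss": float(loss),
                "samples_per_sec": int(gbs) / (float(ms) / 1000.0),
            }

example = (" iteration     7260/  159576 | consumed samples: 286272 | "
           "elapsed time per iteration (ms): 30929.4 | learning rate: 6.000E-05 | "
           "global batch size: 112 | lm loss: 6.896826E+00 |")
print(next(iter_records([example])))  # samples_per_sec ~ 3.62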
- iteration     7270/  159576 | consumed samples:       287392 | elapsed time per iteration (ms): 31181.8 | learning rate: 6.000E-05 | global batch size:   112 | lm loss: 7.211740E+00 | loss scale: 2048.0 | grad norm: 24294.031 | num zeros: 0.0 | number of skipped iterations:   0 | number of nan iterations:   0 |
-time (ms)
- iteration     7280/  159576 | consumed samples:       288512 | elapsed time per iteration (ms): 31068.1 | learning rate: 6.000E-05 | global batch size:   112 | lm loss: 7.055096E+00 | loss scale: 2048.0 | grad norm: 19138.865 | num zeros: 0.0 | number of skipped iterations:   0 | number of nan iterations:   0 |
-time (ms)
-[... repeated apscheduler "missed by" warnings elided (runs at 12:55:21 and 12:56:21 CEST; lag ~3-6 s) ...]
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 12:56:21 CEST)" was missed by 
0:00:05.547479 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 12:56:21 CEST)" was missed by 0:00:05.506715 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 12:56:21 CEST)" was missed by 0:00:05.606490 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 12:56:21 CEST)" was missed by 0:00:05.815120 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 12:56:21 CEST)" was missed by 0:00:05.817521 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 12:56:21 CEST)" was missed by 0:00:05.514081 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 12:56:21 CEST)" was missed by 0:00:05.468266 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 12:56:21 CEST)" was missed by 0:00:05.539330 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 12:56:21 CEST)" was missed by 0:00:05.859799 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 12:56:21 CEST)" was missed by 0:00:05.817029 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 12:56:21 CEST)" was missed by 0:00:05.860686 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 12:56:21 CEST)" was missed by 0:00:05.660883 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 12:56:21 CEST)" was missed by 0:00:05.696819 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 12:56:21 CEST)" was missed by 0:00:05.731058 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 12:56:21 CEST)" was missed by 0:00:05.610041 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 12:56:21 CEST)" was missed by 0:00:05.627405 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 12:56:21 CEST)" was missed by 0:00:05.789948 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 12:56:21 CEST)" was missed by 0:00:05.527710 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 12:56:21 CEST)" was missed by 0:00:05.628375 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: 
interval[0:01:00], next run at: 2021-09-30 12:56:21 CEST)" was missed by 0:00:05.539564 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 12:56:21 CEST)" was missed by 0:00:05.714131 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 12:56:21 CEST)" was missed by 0:00:05.658776 - iteration 7290/ 159576 | consumed samples: 289632 | elapsed time per iteration (ms): 31407.6 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.939375E+00 | loss scale: 2048.0 | grad norm: 42267.743 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 12:57:21 CEST)" was missed by 0:00:06.832788 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 12:57:21 CEST)" was missed by 0:00:06.479475 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 12:57:21 CEST)" was missed by 0:00:06.784887 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 12:57:21 CEST)" was missed by 0:00:06.764392 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 12:57:21 CEST)" was missed by 0:00:06.939968 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 12:57:21 CEST)" was missed by 0:00:06.855015 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 12:57:21 CEST)" was missed by 0:00:06.890003 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 12:57:21 CEST)" was missed by 0:00:06.891348 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 12:57:21 CEST)" was missed by 0:00:06.625509 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 12:57:21 CEST)" was missed by 0:00:06.940421 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 12:57:21 CEST)" was missed by 0:00:06.629019 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 12:57:21 CEST)" was missed by 0:00:06.703233 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 12:57:21 CEST)" was missed by 0:00:06.671633 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 12:57:21 CEST)" was missed by 0:00:06.927638 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: 
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 12:58:21 CEST)" was missed by 0:00:08.893461
-[... same warning repeated ~39 more times across ranks, missed by ~0:00:08.54 to 0:00:09.05 ...]
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 12:59:21 CEST)" was missed by 0:00:08.936011
-[... same warning repeated ~40 more times across ranks, missed by ~0:00:08.58 to 0:00:09.09 ...]
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:00:21 CEST)" was missed by 0:00:09.294073
-[... same warning repeated ~40 more times across ranks, missed by ~0:00:08.94 to 0:00:09.45 ...]
- iteration 7300/ 159576 | consumed samples: 290752 | elapsed time per iteration (ms): 30609.7 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.858105E+00 | loss scale: 2048.0 | grad norm: 18537.037 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7310/ 159576 | consumed samples: 291872 | elapsed time per iteration (ms): 31137.3 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.955675E+00 | loss scale: 2048.0 | grad norm: 113733.993 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:12:21 CEST)" was missed by 0:00:04.753866
-[... same warning repeated ~39 more times across ranks, missed by ~0:00:04.40 to 0:00:04.91 ...]
- iteration 7320/ 159576 | consumed samples: 292992 | elapsed time per iteration (ms): 30994.9 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.981020E+00 | loss scale: 2048.0 | grad norm: 26106.934 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:13:21 CEST)" was missed by 0:00:06.122103
-[... same warning repeated ~39 more times across ranks, missed by ~0:00:05.66 to 0:00:06.17 ...]
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:14:21 CEST)" was missed by 0:00:07.037050
-[... same warning repeated ~30 more times across ranks, missed by ~0:00:06.68 to 0:00:07.19 ...]
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:14:21 CEST)" was missed by 0:00:07.125637 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:14:21 CEST)" was missed by 0:00:06.957036 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:14:21 CEST)" was missed by 0:00:07.119564 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:14:21 CEST)" was missed by 0:00:06.896206 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:14:21 CEST)" was missed by 0:00:06.988345 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:14:21 CEST)" was missed by 0:00:07.043736 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:14:21 CEST)" was missed by 0:00:06.857368 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:14:21 CEST)" was missed by 0:00:06.869251 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:14:21 CEST)" was missed by 0:00:06.958063 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:15:21 CEST)" was missed by 0:00:08.775601 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:15:21 CEST)" was missed by 0:00:08.422284 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:15:21 CEST)" was missed by 0:00:08.646024 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:15:21 CEST)" was missed by 0:00:08.727669 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:15:21 CEST)" was missed by 0:00:08.832847 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:15:21 CEST)" was missed by 0:00:08.568341 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:15:21 CEST)" was missed by 0:00:08.883250 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:15:21 CEST)" was missed by 0:00:08.607480 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:15:21 CEST)" was missed by 0:00:08.870423 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 
2021-09-30 13:15:21 CEST)" was missed by 0:00:08.797876 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:15:21 CEST)" was missed by 0:00:08.582215 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:15:21 CEST)" was missed by 0:00:08.536376 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:15:21 CEST)" was missed by 0:00:08.571829 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:15:21 CEST)" was missed by 0:00:08.927929 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:15:21 CEST)" was missed by 0:00:08.764962 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:15:21 CEST)" was missed by 0:00:08.707264 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:15:21 CEST)" was missed by 0:00:08.882823 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:15:21 CEST)" was missed by 0:00:08.834221 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:15:21 CEST)" was missed by 0:00:08.615628 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:15:21 CEST)" was missed by 0:00:08.674651 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:15:21 CEST)" was missed by 0:00:08.858058 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:15:21 CEST)" was missed by 0:00:08.885707 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:15:21 CEST)" was missed by 0:00:08.614509 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:15:21 CEST)" was missed by 0:00:08.885188 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:15:21 CEST)" was missed by 0:00:08.864176 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:15:21 CEST)" was missed by 0:00:08.678199 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:15:21 CEST)" was missed by 0:00:08.574929 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:15:21 CEST)" was missed by 0:00:08.595872 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:15:21 CEST)" was missed by 0:00:08.729097 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:15:21 CEST)" was missed by 0:00:08.799197 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:15:21 CEST)" was missed by 0:00:08.782234 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:15:21 CEST)" was missed by 0:00:08.824581 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:15:21 CEST)" was missed by 0:00:08.896133 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:15:21 CEST)" was missed by 0:00:08.735986 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:15:21 CEST)" was missed by 0:00:08.928918 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:15:21 CEST)" was missed by 0:00:08.695606 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:15:21 CEST)" was missed by 0:00:08.607790 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:15:21 CEST)" was missed by 0:00:08.726887 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:15:21 CEST)" was missed by 0:00:08.634794 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:15:21 CEST)" was missed by 0:00:08.696621 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:16:21 CEST)" was missed by 0:00:09.784075 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:16:21 CEST)" was missed by 0:00:09.430766 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:16:21 CEST)" was missed by 0:00:09.654535 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:16:21 CEST)" was missed by 0:00:09.715725 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:16:21 CEST)" was missed by 0:00:09.891314 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:16:21 CEST)" was missed by 0:00:09.891741 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:16:21 CEST)" was missed by 0:00:09.736188 
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:16:21 CEST)" was missed by 0:00:09.773437 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:16:21 CEST)" was missed by 0:00:09.806345 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:16:21 CEST)" was missed by 0:00:09.894163 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:16:21 CEST)" was missed by 0:00:09.580325 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:16:21 CEST)" was missed by 0:00:09.878936 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:16:21 CEST)" was missed by 0:00:09.841366 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:16:21 CEST)" was missed by 0:00:09.683124 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:16:21 CEST)" was missed by 0:00:09.590705 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:16:21 CEST)" was missed by 0:00:09.544875 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:16:21 CEST)" was missed by 0:00:09.616025 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:16:21 CEST)" was missed by 0:00:09.622972 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:16:21 CEST)" was missed by 0:00:09.936433 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:16:21 CEST)" was missed by 0:00:09.937336 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:16:21 CEST)" was missed by 0:00:09.737526 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:16:21 CEST)" was missed by 0:00:09.686711 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:16:21 CEST)" was missed by 0:00:09.833055 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:16:21 CEST)" was missed by 0:00:09.842739 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:16:21 CEST)" was missed by 0:00:09.576883 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 
2021-09-30 13:16:21 CEST)" was missed by 0:00:09.583394 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:16:21 CEST)" was missed by 0:00:09.866574 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:16:21 CEST)" was missed by 0:00:09.893672 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:16:21 CEST)" was missed by 0:00:09.643200 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:16:21 CEST)" was missed by 0:00:09.624145 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:16:21 CEST)" was missed by 0:00:09.744456 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:16:21 CEST)" was missed by 0:00:09.705027 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:16:21 CEST)" was missed by 0:00:09.807700 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:16:21 CEST)" was missed by 0:00:09.872695 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:16:21 CEST)" was missed by 0:00:09.904615 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:16:21 CEST)" was missed by 0:00:09.704077 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:16:21 CEST)" was missed by 0:00:09.616238 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:16:21 CEST)" was missed by 0:00:09.735381 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:16:21 CEST)" was missed by 0:00:09.790771 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:16:21 CEST)" was missed by 0:00:09.604406 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:17:21 CEST)" was missed by 0:00:11.634020 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:17:21 CEST)" was missed by 0:00:11.741644 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:17:21 CEST)" was missed by 0:00:11.280687 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:17:21 CEST)" was missed by 0:00:11.504463 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:17:21 CEST)" was missed by 0:00:11.623339 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:17:21 CEST)" was missed by 0:00:11.565626 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:17:21 CEST)" was missed by 0:00:11.741208 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:17:21 CEST)" was missed by 0:00:11.656268 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:17:21 CEST)" was missed by 0:00:11.691271 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:17:21 CEST)" was missed by 0:00:11.440594 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:17:21 CEST)" was missed by 0:00:11.430216 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:17:21 CEST)" was missed by 0:00:11.743554 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:17:21 CEST)" was missed by 0:00:11.586157 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:17:21 CEST)" was missed by 0:00:11.754491 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:17:21 CEST)" was missed by 0:00:11.433291 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:17:21 CEST)" was missed by 0:00:11.533052 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:17:21 CEST)" was missed by 0:00:11.744087 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:17:21 CEST)" was missed by 0:00:11.465939 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:17:21 CEST)" was missed by 0:00:11.472882 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:17:21 CEST)" was missed by 0:00:11.587455 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:17:21 CEST)" was missed by 0:00:11.536618 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:17:21 CEST)" was missed by 0:00:11.682969 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:17:21 CEST)" was missed by 0:00:11.426802 
- iteration 7330/ 159576 | consumed samples: 294112 | elapsed time per iteration (ms): 30678.8 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.983111E+00 | loss scale: 2048.0 | grad norm: 32152.816 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7340/ 159576 | consumed samples: 295232 | elapsed time per iteration (ms): 30566.8 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 7.061161E+00 | loss scale: 2048.0 | grad norm: 25364.748 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7350/ 159576 | consumed samples: 296352 | elapsed time per iteration (ms): 31127.4 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.986874E+00 | loss scale: 2048.0 | grad norm: 61040.711 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:31:21 CEST)" was missed by 0:00:04.263220
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:32:21 CEST)" was missed by 0:00:06.076180
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:32:21 CEST)" was missed by 0:00:06.036542 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:32:21 CEST)" was missed by 0:00:06.099807 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:32:21 CEST)" was missed by 0:00:06.082838 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:32:21 CEST)" was missed by 0:00:06.125177 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:32:21 CEST)" was missed by 0:00:05.996207 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:32:21 CEST)" was missed by 0:00:05.908341 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:32:21 CEST)" was missed by 0:00:05.997131 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:32:21 CEST)" was missed by 0:00:05.896508 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:32:21 CEST)" was missed by 0:00:06.027546 - iteration 7360/ 159576 | consumed samples: 297472 | elapsed time per iteration (ms): 30872.0 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.907494E+00 | loss scale: 2048.0 | grad norm: 24757.249 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:33:21 CEST)" was missed by 0:00:08.549595 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:33:21 CEST)" was missed by 0:00:08.597572 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:33:21 CEST)" was missed by 0:00:08.656129 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:33:21 CEST)" was missed by 0:00:08.390260 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:33:21 CEST)" was missed by 0:00:08.429392 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:33:21 CEST)" was missed by 0:00:08.749829 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:33:21 CEST)" was missed by 0:00:08.692330 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:33:21 CEST)" was missed by 0:00:08.586889 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:33:21 CEST)" was missed by 0:00:08.529189 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:33:21 CEST)" was missed by 0:00:08.704784 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:33:21 CEST)" was missed by 0:00:08.619833 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:33:21 CEST)" was missed by 0:00:08.437531 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:33:21 CEST)" was missed by 0:00:08.396834 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:33:21 CEST)" was missed by 0:00:08.679986 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:33:21 CEST)" was missed by 0:00:08.705218 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:33:21 CEST)" was missed by 0:00:08.244273 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:33:21 CEST)" was missed by 0:00:08.404160 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:33:21 CEST)" was missed by 0:00:08.358311 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:33:21 CEST)" was missed by 0:00:08.393798 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:33:21 CEST)" was missed by 0:00:08.436426 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:33:21 CEST)" was missed by 0:00:08.551023 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:33:21 CEST)" was missed by 0:00:08.456628 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:33:21 CEST)" was missed by 0:00:08.686099 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:33:21 CEST)" was missed by 0:00:08.718074 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:33:21 CEST)" was missed by 0:00:08.496640 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:33:21 CEST)" was missed by 0:00:08.707643 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:33:21 CEST)" was missed by 0:00:08.468072 
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:33:21 CEST)" was missed by 0:00:08.707125 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:33:21 CEST)" was missed by 0:00:08.557886 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:33:21 CEST)" was missed by 0:00:08.500175 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:33:21 CEST)" was missed by 0:00:08.646541 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:33:21 CEST)" was missed by 0:00:08.750822 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:33:21 CEST)" was missed by 0:00:08.429684 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:33:21 CEST)" was missed by 0:00:08.654949 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:33:21 CEST)" was missed by 0:00:08.517578 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:33:21 CEST)" was missed by 0:00:08.518500 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:33:21 CEST)" was missed by 0:00:08.604322 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:33:21 CEST)" was missed by 0:00:08.621309 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:33:21 CEST)" was missed by 0:00:08.417985 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:33:21 CEST)" was missed by 0:00:08.549014 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:34:21 CEST)" was missed by 0:00:10.526475 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:34:21 CEST)" was missed by 0:00:10.406296 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:34:21 CEST)" was missed by 0:00:10.726701 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:34:21 CEST)" was missed by 0:00:10.669218 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:34:21 CEST)" was missed by 0:00:10.506056 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 
2021-09-30 13:34:21 CEST)" was missed by 0:00:10.681678 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:34:21 CEST)" was missed by 0:00:10.596688 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:34:21 CEST)" was missed by 0:00:10.631704 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:34:21 CEST)" was missed by 0:00:10.574471 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:34:21 CEST)" was missed by 0:00:10.633026 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:34:21 CEST)" was missed by 0:00:10.367171 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:34:21 CEST)" was missed by 0:00:10.682094 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:34:21 CEST)" was missed by 0:00:10.221168 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:34:21 CEST)" was missed by 0:00:10.381036 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:34:21 CEST)" was missed by 0:00:10.335172 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:34:21 CEST)" was missed by 0:00:10.413303 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:34:21 CEST)" was missed by 0:00:10.563799 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:34:21 CEST)" was missed by 0:00:10.433491 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:34:21 CEST)" was missed by 0:00:10.414440 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:34:21 CEST)" was missed by 0:00:10.373722 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:34:21 CEST)" was missed by 0:00:10.473496 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:34:21 CEST)" was missed by 0:00:10.656859 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:34:21 CEST)" was missed by 0:00:10.684516 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:34:21 CEST)" was missed by 0:00:10.370700 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:34:21 CEST)" was missed by 0:00:10.444961 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:34:21 CEST)" was missed by 0:00:10.598037 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:34:21 CEST)" was missed by 0:00:10.663001 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:34:21 CEST)" was missed by 0:00:10.477062 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:34:21 CEST)" was missed by 0:00:10.623402 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:34:21 CEST)" was missed by 0:00:10.694958 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:34:21 CEST)" was missed by 0:00:10.534771 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:34:21 CEST)" was missed by 0:00:10.527909 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:34:21 CEST)" was missed by 0:00:10.581089 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:34:21 CEST)" was missed by 0:00:10.406574 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:34:21 CEST)" was missed by 0:00:10.727719 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:34:21 CEST)" was missed by 0:00:10.495367 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:34:21 CEST)" was missed by 0:00:10.494450 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:34:21 CEST)" was missed by 0:00:10.684065 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:34:21 CEST)" was missed by 0:00:10.394743 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:34:21 CEST)" was missed by 0:00:10.525771 - iteration 7370/ 159576 | consumed samples: 298592 | elapsed time per iteration (ms): 30931.3 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.814569E+00 | loss scale: 2048.0 | grad norm: 25779.450 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7380/ 159576 | consumed samples: 299712 | elapsed time per iteration (ms): 30624.8 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.809129E+00 | loss scale: 2048.0 | grad norm: 51701.306 | num zeros: 0.0 | number of skipped 
iterations: 0 | number of nan iterations: 0 | -time (ms) -[2021-09-30 13:45:00] PULSE: tr8-104B is running for 9:52:53 since 2021-09-30T03:52:07 (1289770 on 'gpu_p13' partition (r6i4n[5-6,8],r6i5n[4-5],r7i0n[5-8],r7i1n0,r8i2n8,r8i4n1,r8i7n[3-8],r9i0n[0-8],r9i1n[0-8],r9i2n[3-8],r9i3n[7-8],r9i4n[0-2],r9i5n[2,5-7],r9i6n[2-8],r14i7n[1-6]) -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:48:21 CEST)" was missed by 0:00:04.108150 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:48:21 CEST)" was missed by 0:00:04.060248 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:48:21 CEST)" was missed by 0:00:04.166739 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:48:21 CEST)" was missed by 0:00:03.754861 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:48:21 CEST)" was missed by 0:00:03.978616 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:48:21 CEST)" was missed by 0:00:04.215401 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:48:21 CEST)" was missed by 0:00:04.215832 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:48:21 CEST)" was missed by 0:00:04.097533 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:48:21 CEST)" was missed by 0:00:04.039822 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:48:21 CEST)" was missed by 0:00:03.900937 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:48:21 CEST)" was missed by 0:00:04.007211 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:48:21 CEST)" was missed by 0:00:03.914802 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:48:21 CEST)" was missed by 0:00:03.904417 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:48:21 CEST)" was missed by 0:00:04.260523 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:48:21 CEST)" was missed by 0:00:04.202995 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:48:21 CEST)" was missed by 0:00:03.967228 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:48:21 CEST)" was missed by 
- iteration 7390/ 159576 | consumed samples: 300832 | elapsed time per iteration (ms): 31148.2 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 7.028105E+00 | loss scale: 2048.0 | grad norm: 37826.451 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:49:21 CEST)" was missed by 0:00:06.504009
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:50:21 CEST)" was missed by 0:00:08.257841
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:51:21 CEST)" was missed by 0:00:10.138383
- iteration 7400/ 159576 | consumed samples: 301952 | elapsed time per iteration (ms): 31451.8 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.861111E+00 | loss scale: 2048.0 | grad norm: 31731.800 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7410/ 159576 | consumed samples: 303072 | elapsed time per iteration (ms): 31458.1 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.841044E+00 | loss scale: 2048.0 | grad norm: 39362.034 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:59:21 CEST)" was missed by 0:00:04.529351
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:59:21 CEST)" was missed by 0:00:04.577368 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:59:21 CEST)" was missed by 0:00:04.224083 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:59:21 CEST)" was missed by 0:00:04.447831 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:59:21 CEST)" was missed by 0:00:04.530742 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:59:21 CEST)" was missed by 0:00:04.508975 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:59:21 CEST)" was missed by 0:00:04.684640 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:59:21 CEST)" was missed by 0:00:04.370067 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:59:21 CEST)" was missed by 0:00:04.376639 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:59:21 CEST)" was missed by 0:00:04.383984 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:59:21 CEST)" was missed by 0:00:04.338175 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:59:21 CEST)" was missed by 0:00:04.373613 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:59:21 CEST)" was missed by 0:00:04.672162 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:59:21 CEST)" was missed by 0:00:04.566733 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:59:21 CEST)" was missed by 0:00:04.599611 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:59:21 CEST)" was missed by 0:00:04.636015 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:59:21 CEST)" was missed by 0:00:04.417377 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:59:21 CEST)" was missed by 0:00:04.659837 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:59:21 CEST)" was missed by 0:00:04.685030 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:59:21 CEST)" was missed by 0:00:04.687430 
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:59:21 CEST)" was missed by 0:00:04.409271 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:59:21 CEST)" was missed by 0:00:04.416278 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:59:21 CEST)" was missed by 0:00:04.729695 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:59:21 CEST)" was missed by 0:00:04.730640 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:59:21 CEST)" was missed by 0:00:04.436388 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:59:21 CEST)" was missed by 0:00:04.665946 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:59:21 CEST)" was missed by 0:00:04.480010 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:59:21 CEST)" was missed by 0:00:04.626289 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:59:21 CEST)" was missed by 0:00:04.697883 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:59:21 CEST)" was missed by 0:00:04.476420 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:59:21 CEST)" was missed by 0:00:04.686912 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:59:21 CEST)" was missed by 0:00:04.498307 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:59:21 CEST)" was missed by 0:00:04.537691 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:59:21 CEST)" was missed by 0:00:04.497391 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:59:21 CEST)" was missed by 0:00:04.409486 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:59:21 CEST)" was missed by 0:00:04.634783 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:59:21 CEST)" was missed by 0:00:04.601062 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:59:21 CEST)" was missed by 0:00:04.584158 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 
2021-09-30 13:59:21 CEST)" was missed by 0:00:04.397788 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 13:59:21 CEST)" was missed by 0:00:04.528824 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:00:21 CEST)" was missed by 0:00:07.710958 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:00:21 CEST)" was missed by 0:00:07.758928 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:00:21 CEST)" was missed by 0:00:07.405642 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:00:21 CEST)" was missed by 0:00:07.866160 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:00:21 CEST)" was missed by 0:00:07.551632 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:00:21 CEST)" was missed by 0:00:07.565508 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:00:21 CEST)" was missed by 0:00:07.519717 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:00:21 CEST)" was missed by 0:00:07.629408 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:00:21 CEST)" was missed by 0:00:07.853720 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:00:21 CEST)" was missed by 0:00:07.748267 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:00:21 CEST)" was missed by 0:00:07.690538 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:00:21 CEST)" was missed by 0:00:07.816206 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:00:21 CEST)" was missed by 0:00:07.817546 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:00:21 CEST)" was missed by 0:00:07.866573 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:00:21 CEST)" was missed by 0:00:07.911235 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:00:21 CEST)" was missed by 0:00:07.847469 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:00:21 CEST)" was missed by 0:00:07.598941 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:00:21 CEST)" was missed by 0:00:07.657966 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:00:21 CEST)" was missed by 0:00:07.841418 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:00:21 CEST)" was missed by 0:00:07.869038 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:00:21 CEST)" was missed by 0:00:07.555209 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:00:21 CEST)" was missed by 0:00:07.590834 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:00:21 CEST)" was missed by 0:00:07.597837 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:00:21 CEST)" was missed by 0:00:07.712358 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:00:21 CEST)" was missed by 0:00:07.617969 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:00:21 CEST)" was missed by 0:00:07.782511 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:00:21 CEST)" was missed by 0:00:07.661551 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:00:21 CEST)" was missed by 0:00:07.781215 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:00:21 CEST)" was missed by 0:00:07.807861 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:00:21 CEST)" was missed by 0:00:07.879458 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:00:21 CEST)" was missed by 0:00:07.558254 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:00:21 CEST)" was missed by 0:00:07.912211 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:00:21 CEST)" was missed by 0:00:07.765593 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:00:21 CEST)" was missed by 0:00:07.868499 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:00:21 CEST)" was missed by 0:00:07.719286 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:00:21 CEST)" was missed by 0:00:07.679901 
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:00:21 CEST)" was missed by 0:00:07.591100 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:00:21 CEST)" was missed by 0:00:07.710261 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:00:21 CEST)" was missed by 0:00:07.678982 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:00:21 CEST)" was missed by 0:00:07.579234 - iteration 7420/ 159576 | consumed samples: 304192 | elapsed time per iteration (ms): 31766.0 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 7.036752E+00 | loss scale: 2048.0 | grad norm: 57125.240 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:08:21 CEST)" was missed by 0:00:04.206917 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:08:21 CEST)" was missed by 0:00:04.077352 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:08:21 CEST)" was missed by 0:00:03.999626 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:08:21 CEST)" was missed by 0:00:04.314139 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:08:21 CEST)" was missed by 0:00:03.853661 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:08:21 CEST)" was missed by 0:00:04.159038 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:08:21 CEST)" was missed by 0:00:04.265540 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:08:21 CEST)" was missed by 0:00:04.301781 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:08:21 CEST)" was missed by 0:00:04.105975 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:08:21 CEST)" was missed by 0:00:04.314632 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:08:21 CEST)" was missed by 0:00:04.013592 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:08:21 CEST)" was missed by 0:00:04.003223 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:08:21 CEST)" was missed by 0:00:04.038819 
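Each "iteration" record is a single pipe-delimited line whose fields (consumed samples, elapsed time per iteration, learning rate, global batch size, lm loss, loss scale, grad norm, ...) recur every 10 iterations. Below is one way to pull those numbers out of the raw log for plotting; the parse_iteration helper and its regexes are hypothetical, written against the field layout visible in the records above:

    import re

    ITER_RE = re.compile(r"iteration\s+(\d+)/\s*(\d+)")
    FIELD_RE = re.compile(r"([a-z][a-z ()/]*?):\s*([-+0-9.Ee]+)")

    def parse_iteration(line):
        # hypothetical helper: returns None for lines that are not iteration records
        m = ITER_RE.search(line)
        if m is None:
            return None
        fields = {"iteration": int(m.group(1)), "total_iterations": int(m.group(2))}
        for key, value in FIELD_RE.findall(line):
            fields[key.strip()] = float(value)
        return fields

    sample = ("iteration 7420/ 159576 | consumed samples: 304192 | "
              "elapsed time per iteration (ms): 31766.0 | learning rate: 6.000E-05 | "
              "global batch size: 112 | lm loss: 7.036752E+00 | loss scale: 2048.0 | "
              "grad norm: 57125.240")
    print(parse_iteration(sample)["lm loss"])  # 7.036752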
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:09:21 CEST)" was missed by 0:00:07.223022
- iteration 7430/ 159576 | consumed samples: 305312 | elapsed time per iteration (ms): 31674.8 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.992638E+00 | loss scale: 2048.0 | grad norm: 34955.246 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:10:21 CEST)" was missed by 0:00:09.266942
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 
2021-09-30 14:10:21 CEST)" was missed by 0:00:09.169551 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:10:21 CEST)" was missed by 0:00:09.289223 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:10:21 CEST)" was missed by 0:00:09.315881 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:10:21 CEST)" was missed by 0:00:09.106952 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:10:21 CEST)" was missed by 0:00:09.066233 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:10:21 CEST)" was missed by 0:00:09.349457 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:10:21 CEST)" was missed by 0:00:09.377051 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:10:21 CEST)" was missed by 0:00:09.098836 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:10:21 CEST)" was missed by 0:00:09.187863 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:10:21 CEST)" was missed by 0:00:09.290525 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:10:21 CEST)" was missed by 0:00:09.273584 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:10:21 CEST)" was missed by 0:00:09.099082 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:10:21 CEST)" was missed by 0:00:09.186942 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:10:21 CEST)" was missed by 0:00:09.087258 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:10:21 CEST)" was missed by 0:00:09.227330 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:10:21 CEST)" was missed by 0:00:09.218285 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:11:21 CEST)" was missed by 0:00:12.112834 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:11:21 CEST)" was missed by 0:00:11.983300 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:11:21 CEST)" was missed by 0:00:12.064952 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:11:21 CEST)" was missed by 0:00:12.102189 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:11:21 CEST)" was missed by 0:00:12.044459 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:11:21 CEST)" was missed by 0:00:12.220064 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:11:21 CEST)" was missed by 0:00:12.170105 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:11:21 CEST)" was missed by 0:00:11.905563 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:11:21 CEST)" was missed by 0:00:12.011880 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:11:21 CEST)" was missed by 0:00:11.759568 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:11:21 CEST)" was missed by 0:00:11.873639 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:11:21 CEST)" was missed by 0:00:11.909109 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:11:21 CEST)" was missed by 0:00:11.951710 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:11:21 CEST)" was missed by 0:00:12.265185 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:11:21 CEST)" was missed by 0:00:12.207679 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:11:21 CEST)" was missed by 0:00:12.201379 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:11:21 CEST)" was missed by 0:00:12.015458 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:11:21 CEST)" was missed by 0:00:12.135142 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:11:21 CEST)" was missed by 0:00:12.161784 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:11:21 CEST)" was missed by 0:00:12.233335 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:11:21 CEST)" was missed by 0:00:12.171477 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:11:21 CEST)" was missed by 0:00:11.952856 
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:11:21 CEST)" was missed by 0:00:11.912112 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:11:21 CEST)" was missed by 0:00:12.195339 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:11:21 CEST)" was missed by 0:00:12.220523 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:11:21 CEST)" was missed by 0:00:12.222928 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:11:21 CEST)" was missed by 0:00:11.919486 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:11:21 CEST)" was missed by 0:00:11.944751 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:11:21 CEST)" was missed by 0:00:12.266131 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:11:21 CEST)" was missed by 0:00:12.066290 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:11:21 CEST)" was missed by 0:00:11.971909 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:11:21 CEST)" was missed by 0:00:12.136436 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:11:21 CEST)" was missed by 0:00:12.119499 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:11:21 CEST)" was missed by 0:00:12.032831 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:11:21 CEST)" was missed by 0:00:12.222459 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:11:21 CEST)" was missed by 0:00:12.073211 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:11:21 CEST)" was missed by 0:00:12.033776 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:11:21 CEST)" was missed by 0:00:11.945001 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:11:21 CEST)" was missed by 0:00:11.933166 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:11:21 CEST)" was missed by 0:00:12.064177 - iteration 7440/ 159576 | consumed samples: 306432 | elapsed time per iteration (ms): 31588.4 | learning rate: 6.000E-05 | global 
batch size: 112 | lm loss: 6.848711E+00 | loss scale: 2048.0 | grad norm: 51298.141 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:18:21 CEST)" was missed by 0:00:04.158161 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:18:21 CEST)" was missed by 0:00:03.804865 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:18:21 CEST)" was missed by 0:00:04.028627 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:18:21 CEST)" was missed by 0:00:04.265393 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:18:21 CEST)" was missed by 0:00:04.215438 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:18:21 CEST)" was missed by 0:00:04.216781 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:18:21 CEST)" was missed by 0:00:03.950909 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:18:21 CEST)" was missed by 0:00:03.964779 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:18:21 CEST)" was missed by 0:00:04.253006 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:18:21 CEST)" was missed by 0:00:04.111562 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:18:21 CEST)" was missed by 0:00:04.110310 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:18:21 CEST)" was missed by 0:00:04.147554 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:18:21 CEST)" was missed by 0:00:04.089815 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:18:21 CEST)" was missed by 0:00:04.180448 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:18:21 CEST)" was missed by 0:00:04.207098 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:18:21 CEST)" was missed by 0:00:03.998168 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:18:21 CEST)" was missed by 0:00:03.957455 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 
14:18:21 CEST)" was missed by 0:00:04.057209 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:18:21 CEST)" was missed by 0:00:03.918999 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:18:21 CEST)" was missed by 0:00:03.954430 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:18:21 CEST)" was missed by 0:00:03.990043 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:18:21 CEST)" was missed by 0:00:03.997065 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:18:21 CEST)" was missed by 0:00:04.310555 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:18:21 CEST)" was missed by 0:00:04.311419 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:18:21 CEST)" was missed by 0:00:04.017196 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:18:21 CEST)" was missed by 0:00:04.246710 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:18:21 CEST)" was missed by 0:00:04.060785 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:18:21 CEST)" was missed by 0:00:04.278679 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:18:21 CEST)" was missed by 0:00:04.078144 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:18:21 CEST)" was missed by 0:00:04.265871 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:18:21 CEST)" was missed by 0:00:04.268293 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:18:21 CEST)" was missed by 0:00:04.267762 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:18:21 CEST)" was missed by 0:00:04.181773 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:18:21 CEST)" was missed by 0:00:03.990315 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:18:21 CEST)" was missed by 0:00:04.118563 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:18:21 CEST)" was missed by 0:00:04.079108 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:18:21 CEST)" was missed by 0:00:04.240717 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:18:21 CEST)" was missed by 0:00:04.164867 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:18:21 CEST)" was missed by 0:00:03.978519 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:18:21 CEST)" was missed by 0:00:04.109544 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:19:21 CEST)" was missed by 0:00:06.066879 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:19:21 CEST)" was missed by 0:00:06.161681 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:19:21 CEST)" was missed by 0:00:06.018972 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:19:21 CEST)" was missed by 0:00:06.174103 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:19:21 CEST)" was missed by 0:00:06.124095 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:19:21 CEST)" was missed by 0:00:06.125489 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:19:21 CEST)" was missed by 0:00:05.859611 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:19:21 CEST)" was missed by 0:00:05.866151 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:19:21 CEST)" was missed by 0:00:05.713591 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:19:21 CEST)" was missed by 0:00:05.873499 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:19:21 CEST)" was missed by 0:00:05.827668 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:19:21 CEST)" was missed by 0:00:05.863129 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:19:21 CEST)" was missed by 0:00:05.937328 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:19:21 CEST)" was missed by 0:00:05.898745 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:19:21 CEST)" was missed by 0:00:06.219229 
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:19:21 CEST)" was missed by 0:00:06.020305 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:19:21 CEST)" was missed by 0:00:06.056246 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:19:21 CEST)" was missed by 0:00:05.925920 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:19:21 CEST)" was missed by 0:00:05.998537 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:19:21 CEST)" was missed by 0:00:06.090454 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:19:21 CEST)" was missed by 0:00:06.155428 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:19:21 CEST)" was missed by 0:00:05.969506 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:19:21 CEST)" was missed by 0:00:06.089159 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:19:21 CEST)" was missed by 0:00:06.115818 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:19:21 CEST)" was missed by 0:00:06.187377 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:19:21 CEST)" was missed by 0:00:05.906897 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:19:21 CEST)" was missed by 0:00:05.965934 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:19:21 CEST)" was missed by 0:00:06.174563 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:19:21 CEST)" was missed by 0:00:06.177003 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:19:21 CEST)" was missed by 0:00:05.905794 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:19:21 CEST)" was missed by 0:00:06.176467 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:19:21 CEST)" was missed by 0:00:06.220160 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:19:21 CEST)" was missed by 0:00:05.986877 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 
2021-09-30 14:19:21 CEST)" was missed by 0:00:06.149386 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:19:21 CEST)" was missed by 0:00:06.027276 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:19:21 CEST)" was missed by 0:00:05.987819 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:19:21 CEST)" was missed by 0:00:06.018154 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:19:21 CEST)" was missed by 0:00:06.073519 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:19:21 CEST)" was missed by 0:00:05.899031 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:19:21 CEST)" was missed by 0:00:05.887147 - iteration 7450/ 159576 | consumed samples: 307552 | elapsed time per iteration (ms): 31336.3 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.914678E+00 | loss scale: 2048.0 | grad norm: 81214.139 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:20:21 CEST)" was missed by 0:00:08.259005 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:20:21 CEST)" was missed by 0:00:08.306968 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:20:21 CEST)" was missed by 0:00:07.953661 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:20:21 CEST)" was missed by 0:00:08.067711 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:20:21 CEST)" was missed by 0:00:08.177412 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:20:21 CEST)" was missed by 0:00:08.401727 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:20:21 CEST)" was missed by 0:00:08.238621 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:20:21 CEST)" was missed by 0:00:08.414164 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:20:21 CEST)" was missed by 0:00:08.364167 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:20:21 CEST)" was missed by 0:00:08.365581 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 
14:20:21 CEST)" was missed by 0:00:08.099703 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:20:21 CEST)" was missed by 0:00:08.113578 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:20:21 CEST)" was missed by 0:00:08.138836 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:20:21 CEST)" was missed by 0:00:08.459269 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:20:21 CEST)" was missed by 0:00:08.460193 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:20:21 CEST)" was missed by 0:00:08.260350 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:20:21 CEST)" was missed by 0:00:08.296329 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:20:21 CEST)" was missed by 0:00:08.165989 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:20:21 CEST)" was missed by 0:00:08.330529 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:20:21 CEST)" was missed by 0:00:08.395523 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:20:21 CEST)" was missed by 0:00:08.209583 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:20:21 CEST)" was missed by 0:00:08.329254 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:20:21 CEST)" was missed by 0:00:08.355904 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:20:21 CEST)" was missed by 0:00:08.427467 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:20:21 CEST)" was missed by 0:00:08.146979 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:20:21 CEST)" was missed by 0:00:08.106256 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:20:21 CEST)" was missed by 0:00:08.206006 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:20:21 CEST)" was missed by 0:00:08.389421 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:20:21 CEST)" was missed by 0:00:08.414659 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:20:21 CEST)" was missed by 0:00:08.417085 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:20:21 CEST)" was missed by 0:00:08.103229 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:20:21 CEST)" was missed by 0:00:08.145876 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:20:21 CEST)" was missed by 0:00:08.416554 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:20:21 CEST)" was missed by 0:00:08.227862 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:20:21 CEST)" was missed by 0:00:08.313591 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:20:21 CEST)" was missed by 0:00:08.226944 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:20:21 CEST)" was missed by 0:00:08.139064 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:20:21 CEST)" was missed by 0:00:08.127214 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:20:21 CEST)" was missed by 0:00:08.267370 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:20:21 CEST)" was missed by 0:00:08.258259 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:21:21 CEST)" was missed by 0:00:11.860493 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:21:21 CEST)" was missed by 0:00:11.908462 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:21:21 CEST)" was missed by 0:00:11.701149 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:21:21 CEST)" was missed by 0:00:12.016060 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:21:21 CEST)" was missed by 0:00:11.555124 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:21:21 CEST)" was missed by 0:00:11.715011 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:21:21 CEST)" was missed by 0:00:11.778901 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:21:21 CEST)" was missed by 0:00:11.740279 
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:21:21 CEST)" was missed by 0:00:12.003230 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:21:21 CEST)" was missed by 0:00:11.897769 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:21:21 CEST)" was missed by 0:00:11.840065 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:21:21 CEST)" was missed by 0:00:12.015673 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:21:21 CEST)" was missed by 0:00:11.930712 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:21:21 CEST)" was missed by 0:00:12.028930 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:21:21 CEST)" was missed by 0:00:11.965704 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:21:21 CEST)" was missed by 0:00:11.967059 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:21:21 CEST)" was missed by 0:00:11.707701 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:21:21 CEST)" was missed by 0:00:11.807477 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:21:21 CEST)" was missed by 0:00:12.018477 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:21:21 CEST)" was missed by 0:00:11.669224 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:21:21 CEST)" was missed by 0:00:11.704646 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:21:21 CEST)" was missed by 0:00:11.747318 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:21:21 CEST)" was missed by 0:00:12.060761 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:21:21 CEST)" was missed by 0:00:11.932041 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:21:21 CEST)" was missed by 0:00:11.996989 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:21:21 CEST)" was missed by 0:00:11.811055 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 
2021-09-30 14:21:21 CEST)" was missed by 0:00:11.957381 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:21:21 CEST)" was missed by 0:00:11.748432 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:21:21 CEST)" was missed by 0:00:11.990910 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:21:21 CEST)" was missed by 0:00:12.018007 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:21:21 CEST)" was missed by 0:00:11.868780 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:21:21 CEST)" was missed by 0:00:11.861898 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:21:21 CEST)" was missed by 0:00:11.767527 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:21:21 CEST)" was missed by 0:00:12.061748 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:21:21 CEST)" was missed by 0:00:11.915109 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:21:21 CEST)" was missed by 0:00:11.828437 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:21:21 CEST)" was missed by 0:00:11.728733 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:21:21 CEST)" was missed by 0:00:11.740622 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:21:21 CEST)" was missed by 0:00:11.859760 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:21:21 CEST)" was missed by 0:00:11.829396 - iteration 7460/ 159576 | consumed samples: 308672 | elapsed time per iteration (ms): 31883.0 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 7.237827E+00 | loss scale: 2048.0 | grad norm: 30428.587 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:28:21 CEST)" was missed by 0:00:05.614878 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:28:21 CEST)" was missed by 0:00:05.838625 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:28:21 CEST)" was missed by 0:00:05.920274 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 
14:28:21 CEST)" was missed by 0:00:06.075413 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:28:21 CEST)" was missed by 0:00:06.025463 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:28:21 CEST)" was missed by 0:00:05.968199 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:28:21 CEST)" was missed by 0:00:06.026808 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:28:21 CEST)" was missed by 0:00:05.760928 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:28:21 CEST)" was missed by 0:00:05.867226 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:28:21 CEST)" was missed by 0:00:06.075852 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:28:21 CEST)" was missed by 0:00:05.774808 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:28:21 CEST)" was missed by 0:00:05.728973 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:28:21 CEST)" was missed by 0:00:05.764405 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:28:21 CEST)" was missed by 0:00:06.120513 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:28:21 CEST)" was missed by 0:00:06.063014 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:28:21 CEST)" was missed by 0:00:05.921600 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:28:21 CEST)" was missed by 0:00:05.957573 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:28:21 CEST)" was missed by 0:00:05.899851 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:28:21 CEST)" was missed by 0:00:06.056731 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:28:21 CEST)" was missed by 0:00:05.870813 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:28:21 CEST)" was missed by 0:00:06.088711 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:28:21 CEST)" was missed by 0:00:05.767483 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:28:21 CEST)" was missed by 0:00:06.078327 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:28:21 CEST)" was missed by 0:00:05.800123 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:28:21 CEST)" was missed by 0:00:05.807093 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:28:21 CEST)" was missed by 0:00:06.121460 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:28:21 CEST)" was missed by 0:00:05.827243 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:28:21 CEST)" was missed by 0:00:05.990519 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:28:21 CEST)" was missed by 0:00:05.808222 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:28:21 CEST)" was missed by 0:00:06.050675 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:28:21 CEST)" was missed by 0:00:06.017178 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:28:21 CEST)" was missed by 0:00:06.077807 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:28:21 CEST)" was missed by 0:00:05.889126 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:28:21 CEST)" was missed by 0:00:05.991814 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:28:21 CEST)" was missed by 0:00:05.928594 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:28:21 CEST)" was missed by 0:00:05.888230 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:28:21 CEST)" was missed by 0:00:05.800364 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:28:21 CEST)" was missed by 0:00:05.974890 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:28:21 CEST)" was missed by 0:00:05.788544 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:28:21 CEST)" was missed by 0:00:05.919553 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:29:21 CEST)" was missed by 0:00:12.046337 
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:29:21 CEST)" was missed by 0:00:11.989162
-[... further near-identical misfire warnings elided; all missed by ~12 s ...]
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:29:21 CEST)" was missed by 0:00:11.909138 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:29:21 CEST)" was missed by 0:00:11.809421 - iteration 7470/ 159576 | consumed samples: 309792 | elapsed time per iteration (ms): 31887.0 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 7.109668E+00 | loss scale: 2048.0 | grad norm: 40029.722 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7480/ 159576 | consumed samples: 310912 | elapsed time per iteration (ms): 31169.8 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.894396E+00 | loss scale: 2048.0 | grad norm: 34114.080 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:39:21 CEST)" was missed by 0:00:05.735958 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:39:21 CEST)" was missed by 0:00:05.535736 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:39:21 CEST)" was missed by 0:00:05.583738 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:39:21 CEST)" was missed by 0:00:05.230418 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:39:21 CEST)" was missed by 0:00:05.454157 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:39:21 CEST)" was missed by 0:00:05.678466 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:39:21 CEST)" was missed by 0:00:05.515347 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:39:21 CEST)" was missed by 0:00:05.376455 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:39:21 CEST)" was missed by 0:00:05.344452 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:39:21 CEST)" was missed by 0:00:05.379930 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:39:21 CEST)" was missed by 0:00:05.736902 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:39:21 CEST)" was missed by 0:00:05.537075 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:39:21 CEST)" was missed by 0:00:05.573065 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 
2021-09-30 14:39:21 CEST)" was missed by 0:00:05.442727 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:39:21 CEST)" was missed by 0:00:05.672270 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:39:21 CEST)" was missed by 0:00:05.690956 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:39:21 CEST)" was missed by 0:00:05.632647 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:39:21 CEST)" was missed by 0:00:05.640974 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:39:21 CEST)" was missed by 0:00:05.642337 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:39:21 CEST)" was missed by 0:00:05.382969 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:39:21 CEST)" was missed by 0:00:05.482761 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:39:21 CEST)" was missed by 0:00:05.666158 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:39:21 CEST)" was missed by 0:00:05.691387 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:39:21 CEST)" was missed by 0:00:05.693823 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:39:21 CEST)" was missed by 0:00:05.415782 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:39:21 CEST)" was missed by 0:00:05.415616 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:39:21 CEST)" was missed by 0:00:05.422586 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:39:21 CEST)" was missed by 0:00:05.693233 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:39:21 CEST)" was missed by 0:00:05.504590 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:39:21 CEST)" was missed by 0:00:05.607300 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:39:21 CEST)" was missed by 0:00:05.486354 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:39:21 CEST)" was missed by 0:00:05.606020 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:39:21 CEST)" was missed by 0:00:05.704221 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:39:21 CEST)" was missed by 0:00:05.423781 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:39:21 CEST)" was missed by 0:00:05.390374 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:39:21 CEST)" was missed by 0:00:05.544080 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:39:21 CEST)" was missed by 0:00:05.503712 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:39:21 CEST)" was missed by 0:00:05.590370 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:39:21 CEST)" was missed by 0:00:05.404036 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:39:21 CEST)" was missed by 0:00:05.535023 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:40:21 CEST)" was missed by 0:00:09.339903 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:40:21 CEST)" was missed by 0:00:09.210336 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:40:21 CEST)" was missed by 0:00:09.492204 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:40:21 CEST)" was missed by 0:00:09.434714 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:40:21 CEST)" was missed by 0:00:09.493091 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:40:21 CEST)" was missed by 0:00:09.291991 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:40:21 CEST)" was missed by 0:00:09.329238 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:40:21 CEST)" was missed by 0:00:09.447114 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:40:21 CEST)" was missed by 0:00:09.388817 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:40:21 CEST)" was missed by 0:00:09.397188 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:40:21 CEST)" was missed by 0:00:09.398504 
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:40:21 CEST)" was missed by 0:00:09.132636 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:40:21 CEST)" was missed by 0:00:09.238938 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:40:21 CEST)" was missed by 0:00:08.986635 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:40:21 CEST)" was missed by 0:00:09.100697 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:40:21 CEST)" was missed by 0:00:09.178761 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:40:21 CEST)" was missed by 0:00:09.449442 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:40:21 CEST)" was missed by 0:00:09.293312 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:40:21 CEST)" was missed by 0:00:09.198932 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:40:21 CEST)" was missed by 0:00:09.271558 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:40:21 CEST)" was missed by 0:00:09.428453 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:40:21 CEST)" was missed by 0:00:09.242505 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:40:21 CEST)" was missed by 0:00:09.362192 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:40:21 CEST)" was missed by 0:00:09.460391 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:40:21 CEST)" was missed by 0:00:09.179913 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:40:21 CEST)" was missed by 0:00:09.139184 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:40:21 CEST)" was missed by 0:00:09.422381 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:40:21 CEST)" was missed by 0:00:09.447599 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:40:21 CEST)" was missed by 0:00:09.450002 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 
2021-09-30 14:40:21 CEST)" was missed by 0:00:09.171998 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:40:21 CEST)" was missed by 0:00:09.136170 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:40:21 CEST)" was missed by 0:00:09.171795 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:40:21 CEST)" was missed by 0:00:09.300251 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:40:21 CEST)" was missed by 0:00:09.363527 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:40:21 CEST)" was missed by 0:00:09.259904 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:40:21 CEST)" was missed by 0:00:09.146555 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:40:21 CEST)" was missed by 0:00:09.260834 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:40:21 CEST)" was missed by 0:00:09.346606 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:40:21 CEST)" was missed by 0:00:09.160258 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:40:21 CEST)" was missed by 0:00:09.291257 - iteration 7490/ 159576 | consumed samples: 312032 | elapsed time per iteration (ms): 31726.5 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.827368E+00 | loss scale: 2048.0 | grad norm: 42583.979 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -[2021-09-30 14:45:07] PULSE: tr8-104B is running for 10:53:00 since 2021-09-30T03:52:07 (1289770 on 'gpu_p13' partition (r6i4n[5-6,8],r6i5n[4-5],r7i0n[5-8],r7i1n0,r8i2n8,r8i4n1,r8i7n[3-8],r9i0n[0-8],r9i1n[0-8],r9i2n[3-8],r9i3n[7-8],r9i4n[0-2],r9i5n[2,5-7],r9i6n[2-8],r14i7n[1-6]) -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:46:21 CEST)" was missed by 0:00:07.242072 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:46:21 CEST)" was missed by 0:00:07.112500 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:46:21 CEST)" was missed by 0:00:07.394368 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:46:21 CEST)" was missed by 0:00:07.336877 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:46:21 CEST)" was missed by 0:00:07.194188 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:46:21 CEST)" was missed by 0:00:07.299348 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:46:21 CEST)" was missed by 0:00:07.002861 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:46:21 CEST)" was missed by 0:00:07.231446 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:46:21 CEST)" was missed by 0:00:07.173735 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:46:21 CEST)" was missed by 0:00:07.349326 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:46:21 CEST)" was missed by 0:00:07.264361 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:46:21 CEST)" was missed by 0:00:07.291016 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:46:21 CEST)" was missed by 0:00:07.362565 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:46:21 CEST)" was missed by 0:00:07.300719 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:46:21 CEST)" was missed by 0:00:07.034858 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:46:21 CEST)" was missed by 0:00:07.041344 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:46:21 CEST)" was missed by 0:00:06.888848 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:46:21 CEST)" was missed by 0:00:07.038347 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:46:21 CEST)" was missed by 0:00:07.195487 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:46:21 CEST)" was missed by 0:00:07.141164 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:46:21 CEST)" was missed by 0:00:07.324568 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:46:21 CEST)" was missed by 0:00:07.349790 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:46:21 CEST)" was missed by 0:00:07.352193 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:46:21 CEST)" was missed by 0:00:07.074202 
- iteration 7500/ 159576 | consumed samples: 313152 | elapsed time per iteration (ms): 32352.9 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.763290E+00 | loss scale: 2048.0 | grad norm: 38763.027 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-saving checkpoint at iteration 7500 to /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints
-[2021-09-30 14:46:05,312] [INFO] [logging.py:68:log_dist] [Rank 0] Saving model checkpoint: /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/global_step7500/mp_rank_00_model_states.pt
- successfully saved checkpoint at iteration 7500 to /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints
-time (ms) | save-checkpoint: 17963.88
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:49:21 CEST)" was missed by 0:00:07.997101
-[... further near-identical misfire warnings elided; all missed by ~8 s ...]
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:49:21 CEST)" was missed by 0:00:07.696613 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:49:21 CEST)" was missed by 0:00:07.650761 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:49:21 CEST)" was missed by 0:00:07.686219 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:49:21 CEST)" was missed by 0:00:07.999505 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:49:21 CEST)" was missed by 0:00:07.821637 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:49:21 CEST)" was missed by 0:00:07.896579 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:49:21 CEST)" was missed by 0:00:08.010477 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:49:21 CEST)" was missed by 0:00:08.000107 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:49:21 CEST)" was missed by 0:00:07.721910 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:49:21 CEST)" was missed by 0:00:07.728910 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:49:21 CEST)" was missed by 0:00:08.043211 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:49:21 CEST)" was missed by 0:00:07.843440 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:49:21 CEST)" was missed by 0:00:07.938942 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:49:21 CEST)" was missed by 0:00:07.689299 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:49:21 CEST)" was missed by 0:00:07.850343 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:49:21 CEST)" was missed by 0:00:07.730039 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:49:21 CEST)" was missed by 0:00:07.809962 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:49:21 CEST)" was missed by 0:00:07.722107 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:49:21 CEST)" was missed by 0:00:07.810912 
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:50:21 CEST)" was missed by 0:00:11.877724
-[... further near-identical misfire warnings elided; all missed by ~11-12 s ...]
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:50:21 CEST)" was missed by 0:00:11.629725 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:50:21 CEST)" was missed by 0:00:11.828011 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:50:21 CEST)" was missed by 0:00:11.691552 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:50:21 CEST)" was missed by 0:00:11.794328 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:50:21 CEST)" was missed by 0:00:11.777354 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:50:21 CEST)" was missed by 0:00:11.722026 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:50:21 CEST)" was missed by 0:00:11.591012 - iteration 7510/ 159576 | consumed samples: 314272 | elapsed time per iteration (ms): 34140.2 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.770796E+00 | loss scale: 2048.0 | grad norm: 46758.531 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:57:21 CEST)" was missed by 0:00:05.386152 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:57:21 CEST)" was missed by 0:00:05.032851 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:57:21 CEST)" was missed by 0:00:05.538472 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:57:21 CEST)" was missed by 0:00:05.338268 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:57:21 CEST)" was missed by 0:00:05.474701 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:57:21 CEST)" was missed by 0:00:05.493368 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:57:21 CEST)" was missed by 0:00:05.444749 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:57:21 CEST)" was missed by 0:00:05.178869 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:57:21 CEST)" was missed by 0:00:05.493810 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:57:21 CEST)" was missed by 0:00:05.192786 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:57:21 CEST)" was missed by 0:00:05.256682 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:57:21 CEST)" was missed by 0:00:05.495697 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:57:21 CEST)" was missed by 0:00:05.480996 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:57:21 CEST)" was missed by 0:00:05.375563 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:57:21 CEST)" was missed by 0:00:05.443465 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:57:21 CEST)" was missed by 0:00:05.285225 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:57:21 CEST)" was missed by 0:00:05.146995 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:57:21 CEST)" was missed by 0:00:05.182423 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:57:21 CEST)" was missed by 0:00:05.539397 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:57:21 CEST)" was missed by 0:00:05.317831 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:57:21 CEST)" was missed by 0:00:05.288765 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:57:21 CEST)" was missed by 0:00:05.408468 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:57:21 CEST)" was missed by 0:00:05.435134 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:57:21 CEST)" was missed by 0:00:05.506711 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:57:21 CEST)" was missed by 0:00:05.468655 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:57:21 CEST)" was missed by 0:00:05.496291 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:57:21 CEST)" was missed by 0:00:05.218089 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:57:21 CEST)" was missed by 0:00:05.225094 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:57:21 CEST)" was missed by 0:00:05.346533 
- iteration 7520/ 159576 | consumed samples: 315392 | elapsed time per iteration (ms): 31701.0 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.777987E+00 | loss scale: 2048.0 | grad norm: 55357.791 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:58:21 CEST)" was missed by 0:00:10.180977
-[... further near-identical misfire warnings elided; all missed by ~10 s ...]
2021-09-30 14:58:21 CEST)" was missed by 0:00:10.035307 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:58:21 CEST)" was missed by 0:00:10.077670 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:58:21 CEST)" was missed by 0:00:09.868745 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:58:21 CEST)" was missed by 0:00:09.824997 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:58:21 CEST)" was missed by 0:00:09.860645 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:58:21 CEST)" was missed by 0:00:09.867646 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:58:21 CEST)" was missed by 0:00:10.138284 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:58:21 CEST)" was missed by 0:00:09.989086 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:58:21 CEST)" was missed by 0:00:09.887799 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:58:21 CEST)" was missed by 0:00:09.949592 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:58:21 CEST)" was missed by 0:00:09.828060 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:58:21 CEST)" was missed by 0:00:09.948682 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:58:21 CEST)" was missed by 0:00:10.138861 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:58:21 CEST)" was missed by 0:00:09.860817 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:58:21 CEST)" was missed by 0:00:09.848967 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 14:58:21 CEST)" was missed by 0:00:09.979983 - iteration 7530/ 159576 | consumed samples: 316512 | elapsed time per iteration (ms): 31994.7 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.956472E+00 | loss scale: 2048.0 | grad norm: 80175.951 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:05:21 CEST)" was missed by 0:00:04.781955 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 
15:05:21 CEST)" was missed by 0:00:04.840467 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:05:21 CEST)" was missed by 0:00:04.652361 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:05:21 CEST)" was missed by 0:00:04.934205 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:05:21 CEST)" was missed by 0:00:04.876699 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:05:21 CEST)" was missed by 0:00:04.733999 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:05:21 CEST)" was missed by 0:00:04.889161 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:05:21 CEST)" was missed by 0:00:04.839151 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:05:21 CEST)" was missed by 0:00:04.574668 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:05:21 CEST)" was missed by 0:00:04.428636 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:05:21 CEST)" was missed by 0:00:04.542687 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:05:21 CEST)" was missed by 0:00:04.935115 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:05:21 CEST)" was missed by 0:00:04.713580 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:05:21 CEST)" was missed by 0:00:04.870443 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:05:21 CEST)" was missed by 0:00:04.684534 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:05:21 CEST)" was missed by 0:00:04.804182 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:05:21 CEST)" was missed by 0:00:04.902437 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:05:21 CEST)" was missed by 0:00:04.864350 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:05:21 CEST)" was missed by 0:00:04.588583 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:05:21 CEST)" was missed by 0:00:04.613832 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:05:21 CEST)" was missed by 0:00:04.735348 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:05:21 CEST)" was missed by 0:00:04.771300 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:05:21 CEST)" was missed by 0:00:04.640967 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:05:21 CEST)" was missed by 0:00:04.805496 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:05:21 CEST)" was missed by 0:00:04.788521 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:05:21 CEST)" was missed by 0:00:04.830871 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:05:21 CEST)" was missed by 0:00:04.621939 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:05:21 CEST)" was missed by 0:00:04.581236 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:05:21 CEST)" was missed by 0:00:04.681014 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:05:21 CEST)" was missed by 0:00:04.889635 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:05:21 CEST)" was missed by 0:00:04.892055 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:05:21 CEST)" was missed by 0:00:04.614032 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:05:21 CEST)" was missed by 0:00:04.891483 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:05:21 CEST)" was missed by 0:00:04.742278 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:05:21 CEST)" was missed by 0:00:04.578243 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:05:21 CEST)" was missed by 0:00:04.620884 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:05:21 CEST)" was missed by 0:00:04.701930 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:05:21 CEST)" was missed by 0:00:04.602194 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:05:21 CEST)" was missed by 0:00:04.702828 
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:05:21 CEST)" was missed by 0:00:04.733228 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:06:21 CEST)" was missed by 0:00:10.224847 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:06:21 CEST)" was missed by 0:00:09.871516 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:06:21 CEST)" was missed by 0:00:10.095289 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:06:21 CEST)" was missed by 0:00:10.377111 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:06:21 CEST)" was missed by 0:00:10.319621 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:06:21 CEST)" was missed by 0:00:10.176932 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:06:21 CEST)" was missed by 0:00:10.083888 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:06:21 CEST)" was missed by 0:00:10.332064 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:06:21 CEST)" was missed by 0:00:10.283408 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:06:21 CEST)" was missed by 0:00:10.017563 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:06:21 CEST)" was missed by 0:00:10.031412 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:06:21 CEST)" was missed by 0:00:09.985594 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:06:21 CEST)" was missed by 0:00:10.378034 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:06:21 CEST)" was missed by 0:00:10.214203 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:06:21 CEST)" was missed by 0:00:10.156479 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:06:21 CEST)" was missed by 0:00:10.313389 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:06:21 CEST)" was missed by 0:00:10.127426 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 
2021-09-30 15:06:21 CEST)" was missed by 0:00:10.247101 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:06:21 CEST)" was missed by 0:00:10.345365 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:06:21 CEST)" was missed by 0:00:10.282143 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:06:21 CEST)" was missed by 0:00:10.123907 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:06:21 CEST)" was missed by 0:00:10.307262 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:06:21 CEST)" was missed by 0:00:10.332504 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:06:21 CEST)" was missed by 0:00:10.056789 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:06:21 CEST)" was missed by 0:00:10.334362 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:06:21 CEST)" was missed by 0:00:10.185180 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:06:21 CEST)" was missed by 0:00:10.248450 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:06:21 CEST)" was missed by 0:00:10.273806 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:06:21 CEST)" was missed by 0:00:10.064860 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:06:21 CEST)" was missed by 0:00:10.024166 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:06:21 CEST)" was missed by 0:00:10.144809 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:06:21 CEST)" was missed by 0:00:10.334973 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:06:21 CEST)" was missed by 0:00:10.021127 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:06:21 CEST)" was missed by 0:00:10.063796 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:06:21 CEST)" was missed by 0:00:10.178334 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:06:21 CEST)" was missed by 0:00:10.231509 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:06:21 CEST)" was missed by 0:00:10.056989 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:06:21 CEST)" was missed by 0:00:10.145772 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:06:21 CEST)" was missed by 0:00:10.045145 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:06:21 CEST)" was missed by 0:00:10.176197 - iteration 7540/ 159576 | consumed samples: 317632 | elapsed time per iteration (ms): 32172.3 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 7.038902E+00 | loss scale: 2048.0 | grad norm: 93572.879 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:13:21 CEST)" was missed by 0:00:04.582708 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:13:21 CEST)" was missed by 0:00:04.229359 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:13:21 CEST)" was missed by 0:00:04.389228 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:13:21 CEST)" was missed by 0:00:04.734973 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:13:21 CEST)" was missed by 0:00:04.534788 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:13:21 CEST)" was missed by 0:00:04.689919 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:13:21 CEST)" was missed by 0:00:04.641272 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:13:21 CEST)" was missed by 0:00:04.375415 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:13:21 CEST)" was missed by 0:00:04.690316 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:13:21 CEST)" was missed by 0:00:04.453181 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:13:21 CEST)" was missed by 0:00:04.677503 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:13:21 CEST)" was missed by 0:00:04.735882 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:13:21 CEST)" was missed by 0:00:04.572067 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:13:21 CEST)" was missed by 0:00:04.514337 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:13:21 CEST)" was missed by 0:00:04.671232 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:13:21 CEST)" was missed by 0:00:04.481734 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:13:21 CEST)" was missed by 0:00:04.692819 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:13:21 CEST)" was missed by 0:00:04.343467 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:13:21 CEST)" was missed by 0:00:04.378929 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:13:21 CEST)" was missed by 0:00:04.692237 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:13:21 CEST)" was missed by 0:00:04.536091 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:13:21 CEST)" was missed by 0:00:04.441736 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:13:21 CEST)" was missed by 0:00:04.485304 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:13:21 CEST)" was missed by 0:00:04.605001 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:13:21 CEST)" was missed by 0:00:04.640007 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:13:21 CEST)" was missed by 0:00:04.422725 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:13:21 CEST)" was missed by 0:00:04.665153 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:13:21 CEST)" was missed by 0:00:04.414620 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:13:21 CEST)" was missed by 0:00:04.543048 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:13:21 CEST)" was missed by 0:00:04.631690 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:13:21 CEST)" was missed by 0:00:04.703256 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:13:21 CEST)" was missed by 0:00:04.382024 
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:13:21 CEST)" was missed by 0:00:04.414804 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:13:21 CEST)" was missed by 0:00:04.421640 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:13:21 CEST)" was missed by 0:00:04.503620 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:13:21 CEST)" was missed by 0:00:04.606346 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:13:21 CEST)" was missed by 0:00:04.589390 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:13:21 CEST)" was missed by 0:00:04.502707 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:13:21 CEST)" was missed by 0:00:04.534041 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:13:21 CEST)" was missed by 0:00:04.403038 - iteration 7550/ 159576 | consumed samples: 318752 | elapsed time per iteration (ms): 31787.8 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.987051E+00 | loss scale: 2048.0 | grad norm: 75612.189 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:14:21 CEST)" was missed by 0:00:07.940049 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:14:21 CEST)" was missed by 0:00:07.739892 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:14:21 CEST)" was missed by 0:00:07.894994 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:14:21 CEST)" was missed by 0:00:07.787812 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:14:21 CEST)" was missed by 0:00:07.846357 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:14:21 CEST)" was missed by 0:00:07.434488 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:14:21 CEST)" was missed by 0:00:07.594367 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:14:21 CEST)" was missed by 0:00:07.658274 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:14:21 CEST)" was missed by 0:00:07.882600 
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:14:21 CEST)" was missed by 0:00:07.940968 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:14:21 CEST)" was missed by 0:00:07.777160 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:14:21 CEST)" was missed by 0:00:07.646850 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:14:21 CEST)" was missed by 0:00:07.719431 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:14:21 CEST)" was missed by 0:00:07.876335 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:14:21 CEST)" was missed by 0:00:07.690395 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:14:21 CEST)" was missed by 0:00:07.845063 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:14:21 CEST)" was missed by 0:00:07.580525 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:14:21 CEST)" was missed by 0:00:07.686845 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:14:21 CEST)" was missed by 0:00:07.870228 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:14:21 CEST)" was missed by 0:00:07.895457 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:14:21 CEST)" was missed by 0:00:07.897923 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:14:21 CEST)" was missed by 0:00:07.548572 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:14:21 CEST)" was missed by 0:00:07.748136 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:14:21 CEST)" was missed by 0:00:07.741195 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:14:21 CEST)" was missed by 0:00:07.708697 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:14:21 CEST)" was missed by 0:00:07.811412 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:14:21 CEST)" was missed by 0:00:07.794439 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 
2021-09-30 15:14:21 CEST)" was missed by 0:00:07.810106 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:14:21 CEST)" was missed by 0:00:07.908343 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:14:21 CEST)" was missed by 0:00:07.627828 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:14:21 CEST)" was missed by 0:00:07.619905 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:14:21 CEST)" was missed by 0:00:07.584099 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:14:21 CEST)" was missed by 0:00:07.619747 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:14:21 CEST)" was missed by 0:00:07.626706 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:14:21 CEST)" was missed by 0:00:07.897377 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:14:21 CEST)" was missed by 0:00:07.739080 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:14:21 CEST)" was missed by 0:00:07.836799 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:14:21 CEST)" was missed by 0:00:07.587153 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:14:21 CEST)" was missed by 0:00:07.707779 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:14:21 CEST)" was missed by 0:00:07.608126 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:15:21 CEST)" was missed by 0:00:09.840818 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:15:21 CEST)" was missed by 0:00:09.688556 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:15:21 CEST)" was missed by 0:00:09.335216 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:15:21 CEST)" was missed by 0:00:09.783369 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:15:21 CEST)" was missed by 0:00:09.640633 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:15:21 CEST)" was missed by 0:00:09.795759 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:15:21 CEST)" was missed by 0:00:09.747154 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:15:21 CEST)" was missed by 0:00:09.481308 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:15:21 CEST)" was missed by 0:00:09.770989 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:15:21 CEST)" was missed by 0:00:09.796200 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:15:21 CEST)" was missed by 0:00:09.495138 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:15:21 CEST)" was missed by 0:00:09.449314 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:15:21 CEST)" was missed by 0:00:09.559054 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:15:21 CEST)" was missed by 0:00:09.841750 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:15:21 CEST)" was missed by 0:00:09.620212 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:15:21 CEST)" was missed by 0:00:09.591165 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:15:21 CEST)" was missed by 0:00:09.710852 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:15:21 CEST)" was missed by 0:00:09.528579 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:15:21 CEST)" was missed by 0:00:09.587595 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:15:21 CEST)" was missed by 0:00:09.484828 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:15:21 CEST)" was missed by 0:00:09.798114 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:15:21 CEST)" was missed by 0:00:09.641985 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:15:21 CEST)" was missed by 0:00:09.677957 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:15:21 CEST)" was missed by 0:00:09.777121 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:15:21 CEST)" was missed by 0:00:09.745877 
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:15:21 CEST)" was missed by 0:00:09.798710 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:15:21 CEST)" was missed by 0:00:09.520518 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:15:21 CEST)" was missed by 0:00:09.527507 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:15:21 CEST)" was missed by 0:00:09.648913 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:15:21 CEST)" was missed by 0:00:09.547646 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:15:21 CEST)" was missed by 0:00:09.737551 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:15:21 CEST)" was missed by 0:00:09.809128 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:15:21 CEST)" was missed by 0:00:09.520685 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:15:21 CEST)" was missed by 0:00:09.609478 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:15:21 CEST)" was missed by 0:00:09.695257 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:15:21 CEST)" was missed by 0:00:09.487922 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:15:21 CEST)" was missed by 0:00:09.712226 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:15:21 CEST)" was missed by 0:00:09.608574 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:15:21 CEST)" was missed by 0:00:09.639938 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:15:21 CEST)" was missed by 0:00:09.508935 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:16:21 CEST)" was missed by 0:00:11.009155 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:16:21 CEST)" was missed by 0:00:10.655801 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:16:21 CEST)" was missed by 0:00:10.815691 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 
2021-09-30 15:16:21 CEST)" was missed by 0:00:11.103955 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:16:21 CEST)" was missed by 0:00:10.961232 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:16:21 CEST)" was missed by 0:00:11.067725 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:16:21 CEST)" was missed by 0:00:10.801873 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:16:21 CEST)" was missed by 0:00:10.908163 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:16:21 CEST)" was missed by 0:00:11.116780 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:16:21 CEST)" was missed by 0:00:10.769896 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:16:21 CEST)" was missed by 0:00:10.879643 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:16:21 CEST)" was missed by 0:00:10.998518 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:16:21 CEST)" was missed by 0:00:11.097675 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:16:21 CEST)" was missed by 0:00:10.911734 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:16:21 CEST)" was missed by 0:00:10.849162 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:16:21 CEST)" was missed by 0:00:11.091591 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:16:21 CEST)" was missed by 0:00:10.805392 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:16:21 CEST)" was missed by 0:00:10.962568 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:16:21 CEST)" was missed by 0:00:10.868201 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:16:21 CEST)" was missed by 0:00:10.940808 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:16:21 CEST)" was missed by 0:00:11.032718 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:16:21 CEST)" was missed by 0:00:11.031459 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:16:21 CEST)" was missed by 0:00:11.119282 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:16:21 CEST)" was missed by 0:00:10.848070 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:16:21 CEST)" was missed by 0:00:10.969490 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:16:21 CEST)" was missed by 0:00:10.841261 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:16:21 CEST)" was missed by 0:00:10.841098 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:16:21 CEST)" was missed by 0:00:10.930058 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:16:21 CEST)" was missed by 0:00:11.058141 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:16:21 CEST)" was missed by 0:00:10.808490 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:16:21 CEST)" was missed by 0:00:10.960431 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:16:21 CEST)" was missed by 0:00:11.015819 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:16:21 CEST)" was missed by 0:00:10.929165 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:16:21 CEST)" was missed by 0:00:10.829463 - iteration 7560/ 159576 | consumed samples: 319872 | elapsed time per iteration (ms): 31187.3 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.843769E+00 | loss scale: 2048.0 | grad norm: 42259.985 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7570/ 159576 | consumed samples: 320992 | elapsed time per iteration (ms): 31466.9 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.815006E+00 | loss scale: 2048.0 | grad norm: 82532.379 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:25:21 CEST)" was missed by 0:00:06.175148 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:25:21 CEST)" was missed by 0:00:06.281670 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:25:21 CEST)" was missed by 0:00:06.029693 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 
2021-09-30 15:25:21 CEST)" was missed by 0:00:06.093569 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:25:21 CEST)" was missed by 0:00:06.317892 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:25:21 CEST)" was missed by 0:00:06.330348 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:25:21 CEST)" was missed by 0:00:06.280384 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:25:21 CEST)" was missed by 0:00:06.223135 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:25:21 CEST)" was missed by 0:00:06.015856 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:25:21 CEST)" was missed by 0:00:06.330763 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:25:21 CEST)" was missed by 0:00:05.869824 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:25:21 CEST)" was missed by 0:00:05.983860 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:25:21 CEST)" was missed by 0:00:06.375410 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:25:21 CEST)" was missed by 0:00:06.212483 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:25:21 CEST)" was missed by 0:00:06.311637 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:25:21 CEST)" was missed by 0:00:06.125732 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:25:21 CEST)" was missed by 0:00:06.122159 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:25:21 CEST)" was missed by 0:00:06.305548 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:25:21 CEST)" was missed by 0:00:06.333238 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:25:21 CEST)" was missed by 0:00:06.062017 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:25:21 CEST)" was missed by 0:00:06.183451 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:25:21 CEST)" was missed by 0:00:06.154778 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:25:21 CEST)" was missed by 0:00:06.245415 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:25:21 CEST)" was missed by 0:00:06.063135 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:25:21 CEST)" was missed by 0:00:06.019391 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:25:21 CEST)" was missed by 0:00:06.055043 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:25:21 CEST)" was missed by 0:00:06.176533 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:25:21 CEST)" was missed by 0:00:06.272091 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:25:21 CEST)" was missed by 0:00:06.343674 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:25:21 CEST)" was missed by 0:00:06.376410 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:25:21 CEST)" was missed by 0:00:06.246720 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:25:21 CEST)" was missed by 0:00:06.022459 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:25:21 CEST)" was missed by 0:00:06.332728 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:25:21 CEST)" was missed by 0:00:06.144061 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:25:21 CEST)" was missed by 0:00:06.229808 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:25:21 CEST)" was missed by 0:00:06.143122 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:25:21 CEST)" was missed by 0:00:06.055277 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:25:21 CEST)" was missed by 0:00:06.082225 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:25:21 CEST)" was missed by 0:00:06.043451 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:25:21 CEST)" was missed by 0:00:06.174420 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:26:21 CEST)" was missed by 0:00:10.795023 
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:26:21 CEST)" was missed by 0:00:10.947322 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:26:21 CEST)" was missed by 0:00:10.889804 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:26:21 CEST)" was missed by 0:00:10.747119 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:26:21 CEST)" was missed by 0:00:10.852274 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:26:21 CEST)" was missed by 0:00:10.587781 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:26:21 CEST)" was missed by 0:00:10.877448 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:26:21 CEST)" was missed by 0:00:10.902668 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:26:21 CEST)" was missed by 0:00:10.441756 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:26:21 CEST)" was missed by 0:00:10.601610 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:26:21 CEST)" was missed by 0:00:10.555769 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:26:21 CEST)" was missed by 0:00:10.665510 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:26:21 CEST)" was missed by 0:00:10.755353 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:26:21 CEST)" was missed by 0:00:10.784396 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:26:21 CEST)" was missed by 0:00:10.726654 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:26:21 CEST)" was missed by 0:00:10.883564 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:26:21 CEST)" was missed by 0:00:10.902279 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:26:21 CEST)" was missed by 0:00:10.853659 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:26:21 CEST)" was missed by 0:00:10.635050 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 
2021-09-30 15:26:21 CEST)" was missed by 0:00:10.694096 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:26:21 CEST)" was missed by 0:00:10.905141 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:26:21 CEST)" was missed by 0:00:10.626958 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:26:21 CEST)" was missed by 0:00:10.633921 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:26:21 CEST)" was missed by 0:00:10.818619 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:26:21 CEST)" was missed by 0:00:10.697679 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:26:21 CEST)" was missed by 0:00:10.817355 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:26:21 CEST)" was missed by 0:00:10.915578 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:26:21 CEST)" was missed by 0:00:10.591312 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:26:21 CEST)" was missed by 0:00:10.904617 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:26:21 CEST)" was missed by 0:00:10.948309 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:26:21 CEST)" was missed by 0:00:10.748496 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:26:21 CEST)" was missed by 0:00:10.654149 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:26:21 CEST)" was missed by 0:00:10.746282 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:26:21 CEST)" was missed by 0:00:10.801688 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:26:21 CEST)" was missed by 0:00:10.844041 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:26:21 CEST)" was missed by 0:00:10.594398 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:26:21 CEST)" was missed by 0:00:10.615334 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:26:21 CEST)" was missed by 0:00:10.715034 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:26:21 CEST)" was missed by 0:00:10.716012 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:26:21 CEST)" was missed by 0:00:10.627222 - iteration 7580/ 159576 | consumed samples: 322112 | elapsed time per iteration (ms): 32074.3 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.805882E+00 | loss scale: 2048.0 | grad norm: 58909.807 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:33:21 CEST)" was missed by 0:00:03.738477 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:33:21 CEST)" was missed by 0:00:03.786410 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:33:21 CEST)" was missed by 0:00:03.844989 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:33:21 CEST)" was missed by 0:00:03.579105 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:33:21 CEST)" was missed by 0:00:03.894006 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:33:21 CEST)" was missed by 0:00:03.433089 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:33:21 CEST)" was missed by 0:00:03.592961 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:33:21 CEST)" was missed by 0:00:03.656877 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:33:21 CEST)" was missed by 0:00:03.938702 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:33:21 CEST)" was missed by 0:00:03.881185 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:33:21 CEST)" was missed by 0:00:03.775786 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:33:21 CEST)" was missed by 0:00:03.874914 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:33:21 CEST)" was missed by 0:00:03.893627 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:33:21 CEST)" was missed by 0:00:03.868844 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:33:21 CEST)" was missed by 0:00:03.547174 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:33:21 CEST)" was missed by 0:00:03.582611 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:33:21 CEST)" was missed by 0:00:03.618321 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:33:21 CEST)" was missed by 0:00:03.895954 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:33:21 CEST)" was missed by 0:00:03.746744 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:33:21 CEST)" was missed by 0:00:03.718069 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:33:21 CEST)" was missed by 0:00:03.689019 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:33:21 CEST)" was missed by 0:00:03.808712 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:33:21 CEST)" was missed by 0:00:03.626424 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:33:21 CEST)" was missed by 0:00:03.685459 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:33:21 CEST)" was missed by 0:00:03.896542 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:33:21 CEST)" was missed by 0:00:03.739844 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:33:21 CEST)" was missed by 0:00:03.906967 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:33:21 CEST)" was missed by 0:00:03.625379 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:33:21 CEST)" was missed by 0:00:03.939720 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:33:21 CEST)" was missed by 0:00:03.835423 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:33:21 CEST)" was missed by 0:00:03.843788 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:33:21 CEST)" was missed by 0:00:03.706428 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:33:21 CEST)" was missed by 0:00:03.585774 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:33:21 CEST)" was missed by 0:00:03.645535 
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:33:21 CEST)" was missed by 0:00:03.618606 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:33:21 CEST)" was missed by 0:00:03.707402 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:33:21 CEST)" was missed by 0:00:03.810136 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:33:21 CEST)" was missed by 0:00:03.793221 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:33:21 CEST)" was missed by 0:00:03.606882 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:33:21 CEST)" was missed by 0:00:03.737846 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:34:21 CEST)" was missed by 0:00:08.107113 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:34:21 CEST)" was missed by 0:00:08.214710 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:34:21 CEST)" was missed by 0:00:07.913629 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:34:21 CEST)" was missed by 0:00:07.977552 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:34:21 CEST)" was missed by 0:00:08.259413 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:34:21 CEST)" was missed by 0:00:08.201880 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:34:21 CEST)" was missed by 0:00:08.059191 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:34:21 CEST)" was missed by 0:00:08.096440 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:34:21 CEST)" was missed by 0:00:08.038713 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:34:21 CEST)" was missed by 0:00:08.214331 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:34:21 CEST)" was missed by 0:00:08.164360 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:34:21 CEST)" was missed by 0:00:08.165694 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:34:21 CEST)" was missed by 0:00:08.060562 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:34:21 CEST)" was missed by 0:00:07.966172 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:34:21 CEST)" was missed by 0:00:08.130724 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:34:21 CEST)" was missed by 0:00:07.906446 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:34:21 CEST)" was missed by 0:00:08.028060 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:34:21 CEST)" was missed by 0:00:08.113778 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:34:21 CEST)" was missed by 0:00:07.939263 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:34:21 CEST)" was missed by 0:00:08.058403 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:34:21 CEST)" was missed by 0:00:07.927444 - iteration 7590/ 159576 | consumed samples: 323232 | elapsed time per iteration (ms): 31600.3 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.841225E+00 | loss scale: 2048.0 | grad norm: 74076.037 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:35:21 CEST)" was missed by 0:00:10.742670 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:35:21 CEST)" was missed by 0:00:10.850250 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:35:21 CEST)" was missed by 0:00:10.549220 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:35:21 CEST)" was missed by 0:00:10.613122 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:35:21 CEST)" was missed by 0:00:10.894965 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:35:21 CEST)" was missed by 0:00:10.837452 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:35:21 CEST)" was missed by 0:00:10.694767 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:35:21 CEST)" was missed by 0:00:10.732005 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:35:21 CEST)" was missed by 0:00:10.849893 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:35:21 CEST)" was missed by 0:00:10.801289 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:35:21 CEST)" was missed by 0:00:10.535411 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:35:21 CEST)" was missed by 0:00:10.641712 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:35:21 CEST)" was missed by 0:00:10.825114 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:35:21 CEST)" was missed by 0:00:10.852767 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:35:21 CEST)" was missed by 0:00:10.389339 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:35:21 CEST)" was missed by 0:00:10.503431 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:35:21 CEST)" was missed by 0:00:10.538881 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:35:21 CEST)" was missed by 0:00:10.852190 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:35:21 CEST)" was missed by 0:00:10.702990 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:35:21 CEST)" was missed by 0:00:10.895931 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:35:21 CEST)" was missed by 0:00:10.674332 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:35:21 CEST)" was missed by 0:00:10.831211 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:35:21 CEST)" was missed by 0:00:10.645282 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:35:21 CEST)" was missed by 0:00:10.764947 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:35:21 CEST)" was missed by 0:00:10.863186 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:35:21 CEST)" was missed by 0:00:10.582703 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:35:21 CEST)" was missed by 0:00:10.662626 
- iteration 7600/ 159576 | consumed samples: 324352 | elapsed time per iteration (ms): 31513.8 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 7.037973E+00 | loss scale: 2048.0 | grad norm: 47145.409 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:43:21 CEST)" was missed by 0:00:05.971652
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:44:21 CEST)" was missed by 0:00:08.471069
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:44:21 CEST)" was missed by 0:00:08.623360 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:44:21 CEST)" was missed by 0:00:08.460392 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:44:21 CEST)" was missed by 0:00:08.402675 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:44:21 CEST)" was missed by 0:00:08.578286 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:44:21 CEST)" was missed by 0:00:08.263809 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:44:21 CEST)" was missed by 0:00:08.578708 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:44:21 CEST)" was missed by 0:00:08.117793 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:44:21 CEST)" was missed by 0:00:08.277654 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:44:21 CEST)" was missed by 0:00:08.431361 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:44:21 CEST)" was missed by 0:00:08.565893 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:44:21 CEST)" was missed by 0:00:08.423224 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:44:21 CEST)" was missed by 0:00:08.520014 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:44:21 CEST)" was missed by 0:00:08.528352 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:44:21 CEST)" was missed by 0:00:08.529676 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:44:21 CEST)" was missed by 0:00:08.370114 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:44:21 CEST)" was missed by 0:00:08.581156 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:44:21 CEST)" was missed by 0:00:08.231855 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:44:21 CEST)" was missed by 0:00:08.341601 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:44:21 CEST)" was missed by 0:00:08.309941 
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:44:21 CEST)" was missed by 0:00:08.494668 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:44:21 CEST)" was missed by 0:00:08.559628 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:44:21 CEST)" was missed by 0:00:08.373677 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:44:21 CEST)" was missed by 0:00:08.493366 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:44:21 CEST)" was missed by 0:00:08.591574 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:44:21 CEST)" was missed by 0:00:08.311090 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:44:21 CEST)" was missed by 0:00:08.553571 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:44:21 CEST)" was missed by 0:00:08.267353 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:44:21 CEST)" was missed by 0:00:08.303006 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:44:21 CEST)" was missed by 0:00:08.580664 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:44:21 CEST)" was missed by 0:00:08.624353 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:44:21 CEST)" was missed by 0:00:08.477709 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:44:21 CEST)" was missed by 0:00:08.270410 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:44:21 CEST)" was missed by 0:00:08.391033 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:44:21 CEST)" was missed by 0:00:08.424556 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:44:21 CEST)" was missed by 0:00:08.330139 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:44:21 CEST)" was missed by 0:00:08.291401 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:44:21 CEST)" was missed by 0:00:08.303259 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 
- iteration 7610/ 159576 | consumed samples: 325472 | elapsed time per iteration (ms): 31590.7 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.990118E+00 | loss scale: 2048.0 | grad norm: 84067.092 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:45:21 CEST)" was missed by 0:00:10.816017
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:45:21 CEST)" was missed by 0:00:10.833441 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:45:21 CEST)" was missed by 0:00:10.658902 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:45:21 CEST)" was missed by 0:00:10.850422 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:45:21 CEST)" was missed by 0:00:10.747720 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:45:21 CEST)" was missed by 0:00:10.647139 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:45:21 CEST)" was missed by 0:00:10.778132 -[2021-09-30 15:45:18] PULSE: tr8-104B is running for 11:53:11 since 2021-09-30T03:52:07 (1289770 on 'gpu_p13' partition (r6i4n[5-6,8],r6i5n[4-5],r7i0n[5-8],r7i1n0,r8i2n8,r8i4n1,r8i7n[3-8],r9i0n[0-8],r9i1n[0-8],r9i2n[3-8],r9i3n[7-8],r9i4n[0-2],r9i5n[2,5-7],r9i6n[2-8],r14i7n[1-6]) - iteration 7620/ 159576 | consumed samples: 326592 | elapsed time per iteration (ms): 31497.5 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 7.023246E+00 | loss scale: 4096.0 | grad norm: 134915.030 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:55:21 CEST)" was missed by 0:00:04.575061 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:55:21 CEST)" was missed by 0:00:04.669857 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:55:21 CEST)" was missed by 0:00:04.527174 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:55:21 CEST)" was missed by 0:00:04.506691 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:55:21 CEST)" was missed by 0:00:04.682294 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:55:21 CEST)" was missed by 0:00:04.367851 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:55:21 CEST)" was missed by 0:00:04.682739 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:55:21 CEST)" was missed by 0:00:04.221812 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:55:21 CEST)" was missed by 0:00:04.727402 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:55:21 CEST)" was 
- iteration 7630/ 159576 | consumed samples: 327712 | elapsed time per iteration (ms): 31050.3 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 7.049447E+00 | loss scale: 4096.0 | grad norm: 100552.823 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:56:21 CEST)" was missed by 0:00:06.342246
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:57:21 CEST)" was missed by 0:00:08.675850
0:00:08.783066 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:57:21 CEST)" was missed by 0:00:08.733121 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:57:21 CEST)" was missed by 0:00:08.783555 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:57:21 CEST)" was missed by 0:00:08.770662 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:57:21 CEST)" was missed by 0:00:08.628008 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:57:21 CEST)" was missed by 0:00:08.665230 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:57:21 CEST)" was missed by 0:00:08.699370 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:57:21 CEST)" was missed by 0:00:08.734498 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:57:21 CEST)" was missed by 0:00:08.468673 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:57:21 CEST)" was missed by 0:00:08.574893 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:57:21 CEST)" was missed by 0:00:08.758356 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:57:21 CEST)" was missed by 0:00:08.322626 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:57:21 CEST)" was missed by 0:00:08.482425 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:57:21 CEST)" was missed by 0:00:08.436698 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:57:21 CEST)" was missed by 0:00:08.546419 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:57:21 CEST)" was missed by 0:00:08.828205 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:57:21 CEST)" was missed by 0:00:08.636222 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:57:21 CEST)" was missed by 0:00:08.682474 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:57:21 CEST)" was missed by 0:00:08.578480 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: 
interval[0:01:00], next run at: 2021-09-30 15:57:21 CEST)" was missed by 0:00:08.698167 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:57:21 CEST)" was missed by 0:00:08.796356 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:57:21 CEST)" was missed by 0:00:08.515886 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:57:21 CEST)" was missed by 0:00:08.785960 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:57:21 CEST)" was missed by 0:00:08.472148 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:57:21 CEST)" was missed by 0:00:08.507805 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:57:21 CEST)" was missed by 0:00:08.829176 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:57:21 CEST)" was missed by 0:00:08.534906 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:57:21 CEST)" was missed by 0:00:08.764496 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:57:21 CEST)" was missed by 0:00:08.627108 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:57:21 CEST)" was missed by 0:00:08.724851 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:57:21 CEST)" was missed by 0:00:08.475192 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:57:21 CEST)" was missed by 0:00:08.595845 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:57:21 CEST)" was missed by 0:00:08.496109 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:57:21 CEST)" was missed by 0:00:08.514845 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:57:21 CEST)" was missed by 0:00:08.785480 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:57:21 CEST)" was missed by 0:00:08.629360 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:57:21 CEST)" was missed by 0:00:08.596851 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:57:21 CEST)" was missed by 0:00:08.508042 -WARNING:apscheduler.executors.default:Run 
time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:58:21 CEST)" was missed by 0:00:10.890499 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:58:21 CEST)" was missed by 0:00:10.958907 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:58:21 CEST)" was missed by 0:00:11.066092 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:58:21 CEST)" was missed by 0:00:11.053709 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:58:21 CEST)" was missed by 0:00:10.911078 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:58:21 CEST)" was missed by 0:00:10.948277 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:58:21 CEST)" was missed by 0:00:11.017532 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:58:21 CEST)" was missed by 0:00:10.751704 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:58:21 CEST)" was missed by 0:00:11.066619 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:58:21 CEST)" was missed by 0:00:10.605678 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:58:21 CEST)" was missed by 0:00:10.719720 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:58:21 CEST)" was missed by 0:00:11.111227 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:58:21 CEST)" was missed by 0:00:10.919265 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:58:21 CEST)" was missed by 0:00:10.982487 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:58:21 CEST)" was missed by 0:00:10.861501 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:58:21 CEST)" was missed by 0:00:11.016258 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:58:21 CEST)" was missed by 0:00:10.857938 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:58:21 CEST)" was missed by 0:00:11.041416 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:58:21 CEST)" was missed by 
0:00:10.765522 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:58:21 CEST)" was missed by 0:00:10.829446 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:58:21 CEST)" was missed by 0:00:11.112208 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:58:21 CEST)" was missed by 0:00:10.981216 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:58:21 CEST)" was missed by 0:00:11.079426 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:58:21 CEST)" was missed by 0:00:10.798948 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:58:21 CEST)" was missed by 0:00:11.069029 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:58:21 CEST)" was missed by 0:00:10.755197 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:58:21 CEST)" was missed by 0:00:10.790862 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:58:21 CEST)" was missed by 0:00:10.797879 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:58:21 CEST)" was missed by 0:00:11.068532 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:58:21 CEST)" was missed by 0:00:10.817948 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:58:21 CEST)" was missed by 0:00:11.007900 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:58:21 CEST)" was missed by 0:00:10.912408 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:58:21 CEST)" was missed by 0:00:11.047530 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:58:21 CEST)" was missed by 0:00:10.758299 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:58:21 CEST)" was missed by 0:00:10.965602 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:58:21 CEST)" was missed by 0:00:10.878898 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:58:21 CEST)" was missed by 0:00:10.879889 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: 
interval[0:01:00], next run at: 2021-09-30 15:58:21 CEST)" was missed by 0:00:10.791089 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:58:21 CEST)" was missed by 0:00:10.910247 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 15:58:21 CEST)" was missed by 0:00:10.779314 - iteration 7640/ 159576 | consumed samples: 328832 | elapsed time per iteration (ms): 31205.2 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 7.008932E+00 | loss scale: 4096.0 | grad norm: 89772.046 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7650/ 159576 | consumed samples: 329952 | elapsed time per iteration (ms): 31008.6 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.867789E+00 | loss scale: 4096.0 | grad norm: 110491.968 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:09:21 CEST)" was missed by 0:00:04.887918 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:09:21 CEST)" was missed by 0:00:04.837919 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:09:21 CEST)" was missed by 0:00:04.679730 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:09:21 CEST)" was missed by 0:00:04.888345 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:09:21 CEST)" was missed by 0:00:04.427434 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:09:21 CEST)" was missed by 0:00:04.933035 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:09:21 CEST)" was missed by 0:00:04.741031 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:09:21 CEST)" was missed by 0:00:04.875472 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:09:21 CEST)" was missed by 0:00:04.732830 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:09:21 CEST)" was missed by 0:00:04.787270 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:09:21 CEST)" was missed by 0:00:04.901236 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:09:21 CEST)" was missed by 0:00:04.780751 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:09:21 CEST)" was missed by 
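The pipe-delimited iteration lines above follow Megatron-LM's fixed training-log format, so the loss curve and throughput can be recovered mechanically. A minimal, hypothetical Python sketch for doing so (the field names are taken from the lines above; the script itself is not part of this run):

import re

ITER_RE = re.compile(
    r"iteration\s+(\d+)/\s*\d+ \| consumed samples: (\d+) \| "
    r"elapsed time per iteration \(ms\): ([\d.]+) \| .* \| "
    r"global batch size: (\d+) \| lm loss: ([\d.E+-]+)"
)

def parse_iterations(path):
    # Yields (iteration, consumed_samples, ms_per_iter, batch_size, lm_loss).
    with open(path) as f:
        for line in f:
            m = ITER_RE.search(line)
            if m:
                it, samples, ms, gbs, loss = m.groups()
                yield int(it), int(samples), float(ms), int(gbs), float(loss)

for it, samples, ms, gbs, loss in parse_iterations("main_log.txt"):
    # e.g. iteration 7630: 112 samples / 31.05 s ≈ 3.6 samples/s
    print(f"iter {it}: loss={loss:.4f}, throughput={gbs / (ms / 1000):.2f} samples/s")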
-[... repeated missed-run warnings for the 16:10:21 CEST run omitted; delays ≈ 9.2-9.7 s ...]
- iteration 7660/ 159576 | consumed samples: 331120 | elapsed time per iteration (ms): 31605.0 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 6.823728E+00 | loss scale: 4096.0 | grad norm: 93872.271 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-[... repeated missed-run warnings for the 16:16:21 CEST run omitted; delays ≈ 7.3-7.8 s ...]
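Between the iteration 7650 and 7660 lines above, consumed samples advance by 331120 - 329952 = 1168 over ten steps while the logged global batch size moves from 112 to 128; that is consistent with the batch-size ramp-up switching mid-interval, seven steps at 112 and three at 128. A quick check of that bookkeeping (plain arithmetic, not from the log):

old_gbs, new_gbs, steps = 112, 128, 10
delta = 331120 - 329952                      # consumed-samples delta, iter 7650 -> 7660
steps_old = (new_gbs * steps - delta) // (new_gbs - old_gbs)
assert steps_old == 7
assert steps_old * old_gbs + (steps - steps_old) * new_gbs == delta  # 7*112 + 3*128 = 1168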
- iteration 7670/ 159576 | consumed samples: 332400 | elapsed time per iteration (ms): 32554.9 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 6.805772E+00 | loss scale: 4096.0 | grad norm: 132566.102 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7680/ 159576 | consumed samples: 333680 | elapsed time per iteration (ms): 32545.5 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 7.017227E+00 | loss scale: 4096.0 | grad norm: 105333.215 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-[... repeated missed-run warnings for the 16:22:21 CEST run omitted; delays ≈ 5.7-6.2 s ...]
-[... repeated missed-run warnings for the 16:23:21 CEST run omitted; delays ≈ 9.4-9.9 s ...]
- iteration 7690/ 159576 | consumed samples: 334960 | elapsed time per iteration (ms): 32896.0 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 7.019022E+00 | loss scale: 4096.0 | grad norm: 102136.212 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-[... repeated missed-run warnings for the 16:28:21 CEST run omitted; delays ≈ 4.4-4.9 s ...]
16:28:21 CEST)" was missed by 0:00:04.541499 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:28:21 CEST)" was missed by 0:00:04.858816 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:28:21 CEST)" was missed by 0:00:04.555316 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:28:21 CEST)" was missed by 0:00:04.544949 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:28:21 CEST)" was missed by 0:00:04.901047 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:28:21 CEST)" was missed by 0:00:04.680366 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:28:21 CEST)" was missed by 0:00:04.806028 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:28:21 CEST)" was missed by 0:00:04.831161 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:28:21 CEST)" was missed by 0:00:04.651299 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:28:21 CEST)" was missed by 0:00:04.771015 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:28:21 CEST)" was missed by 0:00:04.869237 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:28:21 CEST)" was missed by 0:00:04.580627 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:28:21 CEST)" was missed by 0:00:04.858305 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:28:21 CEST)" was missed by 0:00:04.709080 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:28:21 CEST)" was missed by 0:00:04.901995 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:28:21 CEST)" was missed by 0:00:04.702174 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:28:21 CEST)" was missed by 0:00:04.772319 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:28:21 CEST)" was missed by 0:00:04.837267 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:28:21 CEST)" was missed by 0:00:04.588769 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:28:21 CEST)" was missed by 0:00:04.548036 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:28:21 CEST)" was missed by 0:00:04.607803 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:28:21 CEST)" was missed by 0:00:04.755397 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:28:21 CEST)" was missed by 0:00:04.797701 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:28:21 CEST)" was missed by 0:00:04.668726 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:28:21 CEST)" was missed by 0:00:04.580900 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:28:21 CEST)" was missed by 0:00:04.587723 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:28:21 CEST)" was missed by 0:00:04.569066 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:28:21 CEST)" was missed by 0:00:04.700053 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:28:21 CEST)" was missed by 0:00:04.669709 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:29:21 CEST)" was missed by 0:00:10.003107 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:29:21 CEST)" was missed by 0:00:09.895940 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:29:21 CEST)" was missed by 0:00:09.794937 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:29:21 CEST)" was missed by 0:00:09.766376 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:29:21 CEST)" was missed by 0:00:09.990738 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:29:21 CEST)" was missed by 0:00:09.885292 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:29:21 CEST)" was missed by 0:00:09.954529 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:29:21 CEST)" was missed by 0:00:09.542647 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:29:21 CEST)" was missed by 0:00:09.656690 
- iteration 7700/ 159576 | consumed samples: 336240 | elapsed time per iteration (ms): 32809.5 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 6.926086E+00 | loss scale: 4096.0 | grad norm: 98196.566 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:34:21 CEST)" was missed by 0:00:08.442991
- iteration 7710/ 159576 | consumed samples: 337520 | elapsed time per iteration (ms): 32735.8 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 6.877395E+00 | loss scale: 4096.0 | grad norm: 138571.074 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:39:21 CEST)" was missed by 0:00:03.670922
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:40:21 CEST)" was missed by 0:00:06.400324
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:40:21 CEST)" was missed by 0:00:06.092489 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:40:21 CEST)" was missed by 0:00:06.375753 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:40:21 CEST)" was missed by 0:00:06.244508 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:40:21 CEST)" was missed by 0:00:06.113516 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:40:21 CEST)" was missed by 0:00:06.300005 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:41:21 CEST)" was missed by 0:00:09.663958 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:41:21 CEST)" was missed by 0:00:09.427236 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:41:21 CEST)" was missed by 0:00:09.605691 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:41:21 CEST)" was missed by 0:00:09.614034 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:41:21 CEST)" was missed by 0:00:09.556812 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:41:21 CEST)" was missed by 0:00:09.546135 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:41:21 CEST)" was missed by 0:00:09.488412 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:41:21 CEST)" was missed by 0:00:09.580353 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:41:21 CEST)" was missed by 0:00:09.579057 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:41:21 CEST)" was missed by 0:00:09.710019 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:41:21 CEST)" was missed by 0:00:09.508989 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:41:21 CEST)" was missed by 0:00:09.415805 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:41:21 CEST)" was missed by 0:00:09.677299 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 
2021-09-30 16:41:21 CEST)" was missed by 0:00:09.455849 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:41:21 CEST)" was missed by 0:00:09.203557 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:41:21 CEST)" was missed by 0:00:09.517136 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:41:21 CEST)" was missed by 0:00:09.651664 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:41:21 CEST)" was missed by 0:00:09.510186 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:41:21 CEST)" was missed by 0:00:09.563411 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:41:21 CEST)" was missed by 0:00:09.459423 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:41:21 CEST)" was missed by 0:00:09.615443 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:41:21 CEST)" was missed by 0:00:09.349607 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:41:21 CEST)" was missed by 0:00:09.396848 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:41:21 CEST)" was missed by 0:00:09.356083 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:41:21 CEST)" was missed by 0:00:09.664533 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:41:21 CEST)" was missed by 0:00:09.666884 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:41:21 CEST)" was missed by 0:00:09.388910 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:41:21 CEST)" was missed by 0:00:09.363448 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:41:21 CEST)" was missed by 0:00:09.317654 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:41:21 CEST)" was missed by 0:00:09.353099 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:41:21 CEST)" was missed by 0:00:09.395719 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:41:21 CEST)" was missed by 0:00:09.709218 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:41:21 CEST)" was missed by 0:00:09.666395 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:41:21 CEST)" was missed by 0:00:09.477696 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:41:21 CEST)" was missed by 0:00:09.645340 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:41:21 CEST)" was missed by 0:00:09.476788 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:41:21 CEST)" was missed by 0:00:09.639334 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:41:21 CEST)" was missed by 0:00:09.377083 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:41:21 CEST)" was missed by 0:00:09.388768 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:41:21 CEST)" was missed by 0:00:09.508091 - iteration 7720/ 159576 | consumed samples: 338800 | elapsed time per iteration (ms): 32224.1 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 6.873489E+00 | loss scale: 4096.0 | grad norm: 251367.921 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -[2021-09-30 16:45:19] PULSE: tr8-104B is running for 12:53:12 since 2021-09-30T03:52:07 (1289770 on 'gpu_p13' partition (r6i4n[5-6,8],r6i5n[4-5],r7i0n[5-8],r7i1n0,r8i2n8,r8i4n1,r8i7n[3-8],r9i0n[0-8],r9i1n[0-8],r9i2n[3-8],r9i3n[7-8],r9i4n[0-2],r9i5n[2,5-7],r9i6n[2-8],r14i7n[1-6]) -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:47:21 CEST)" was missed by 0:00:05.456308 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:47:21 CEST)" was missed by 0:00:05.349155 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:47:21 CEST)" was missed by 0:00:05.502294 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:47:21 CEST)" was missed by 0:00:05.219586 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:47:21 CEST)" was missed by 0:00:05.302486 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:47:21 CEST)" was missed by 0:00:05.280751 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:47:21 CEST)" was missed by 0:00:04.995883 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:47:21 CEST)" was 
missed by 0:00:05.155747 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:47:21 CEST)" was missed by 0:00:05.309438 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:47:21 CEST)" was missed by 0:00:05.301267 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:47:21 CEST)" was missed by 0:00:05.338488 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:47:21 CEST)" was missed by 0:00:05.208122 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:47:21 CEST)" was missed by 0:00:05.398059 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:47:21 CEST)" was missed by 0:00:05.406410 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:47:21 CEST)" was missed by 0:00:05.141915 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:47:21 CEST)" was missed by 0:00:05.248193 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:47:21 CEST)" was missed by 0:00:05.456818 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:47:21 CEST)" was missed by 0:00:05.181200 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:47:21 CEST)" was missed by 0:00:05.145384 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:47:21 CEST)" was missed by 0:00:05.501504 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:47:21 CEST)" was missed by 0:00:05.458703 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:47:21 CEST)" was missed by 0:00:05.443987 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:47:21 CEST)" was missed by 0:00:05.269994 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:47:21 CEST)" was missed by 0:00:05.437660 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:47:21 CEST)" was missed by 0:00:05.371419 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:47:21 CEST)" was missed by 0:00:05.469676 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: 
interval[0:01:00], next run at: 2021-09-30 16:47:21 CEST)" was missed by 0:00:05.407777 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:47:21 CEST)" was missed by 0:00:05.431604 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:47:21 CEST)" was missed by 0:00:05.459236 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:47:21 CEST)" was missed by 0:00:05.109956 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:47:21 CEST)" was missed by 0:00:05.181046 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:47:21 CEST)" was missed by 0:00:05.372735 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:47:21 CEST)" was missed by 0:00:05.251751 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:47:21 CEST)" was missed by 0:00:05.189193 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:47:21 CEST)" was missed by 0:00:05.148432 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:47:21 CEST)" was missed by 0:00:05.269139 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:47:21 CEST)" was missed by 0:00:05.188078 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:47:21 CEST)" was missed by 0:00:05.300439 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:47:21 CEST)" was missed by 0:00:05.355835 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:47:21 CEST)" was missed by 0:00:05.169470 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:48:21 CEST)" was missed by 0:00:08.892091 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:48:21 CEST)" was missed by 0:00:09.060434 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:48:21 CEST)" was missed by 0:00:08.998573 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:48:21 CEST)" was missed by 0:00:08.732714 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:48:21 CEST)" was missed by 0:00:08.771848 -WARNING:apscheduler.executors.default:Run 
time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:48:21 CEST)" was missed by 0:00:09.092339 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:48:21 CEST)" was missed by 0:00:08.900281 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:48:21 CEST)" was missed by 0:00:09.034794 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:48:21 CEST)" was missed by 0:00:08.929335 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:48:21 CEST)" was missed by 0:00:08.871587 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:48:21 CEST)" was missed by 0:00:08.963557 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:48:21 CEST)" was missed by 0:00:09.028448 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:48:21 CEST)" was missed by 0:00:08.962250 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:48:21 CEST)" was missed by 0:00:08.988912 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:48:21 CEST)" was missed by 0:00:08.997240 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:48:21 CEST)" was missed by 0:00:08.779967 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:48:21 CEST)" was missed by 0:00:09.022435 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:48:21 CEST)" was missed by 0:00:09.047635 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:48:21 CEST)" was missed by 0:00:09.050098 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:48:21 CEST)" was missed by 0:00:08.586734 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:48:21 CEST)" was missed by 0:00:08.746592 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:48:21 CEST)" was missed by 0:00:08.700759 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:48:21 CEST)" was missed by 0:00:08.736261 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:48:21 CEST)" was missed by 
0:00:09.049560 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:48:21 CEST)" was missed by 0:00:08.891269 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:48:21 CEST)" was missed by 0:00:08.739306 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:48:21 CEST)" was missed by 0:00:08.859983 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:48:21 CEST)" was missed by 0:00:08.778913 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:48:21 CEST)" was missed by 0:00:08.946645 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:48:21 CEST)" was missed by 0:00:08.760283 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:48:21 CEST)" was missed by 0:00:09.047748 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:48:21 CEST)" was missed by 0:00:08.940580 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:48:21 CEST)" was missed by 0:00:09.093728 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:48:21 CEST)" was missed by 0:00:08.811034 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:48:21 CEST)" was missed by 0:00:08.799545 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:48:21 CEST)" was missed by 0:00:08.893922 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:48:21 CEST)" was missed by 0:00:08.772636 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:48:21 CEST)" was missed by 0:00:08.861411 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:48:21 CEST)" was missed by 0:00:08.839639 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:48:21 CEST)" was missed by 0:00:08.843161 - iteration 7730/ 159576 | consumed samples: 340080 | elapsed time per iteration (ms): 32169.9 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 6.852581E+00 | loss scale: 4096.0 | grad norm: 124724.028 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) - iteration 7740/ 159576 | consumed samples: 341360 | elapsed time per iteration (ms): 32014.7 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 6.781658E+00 | loss scale: 
4096.0 | grad norm: 146066.261 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:55:21 CEST)" was missed by 0:00:06.051711 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:55:21 CEST)" was missed by 0:00:06.002882 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:55:21 CEST)" was missed by 0:00:06.112884 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:55:21 CEST)" was missed by 0:00:05.873288 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:55:21 CEST)" was missed by 0:00:05.963136 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:55:21 CEST)" was missed by 0:00:06.097689 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:55:21 CEST)" was missed by 0:00:06.156019 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:55:21 CEST)" was missed by 0:00:05.956176 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:55:21 CEST)" was missed by 0:00:05.954965 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:55:21 CEST)" was missed by 0:00:05.992164 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:55:21 CEST)" was missed by 0:00:05.861823 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:55:21 CEST)" was missed by 0:00:05.934456 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:55:21 CEST)" was missed by 0:00:06.091360 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:55:21 CEST)" was missed by 0:00:06.110040 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:55:21 CEST)" was missed by 0:00:06.025075 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:55:21 CEST)" was missed by 0:00:06.123325 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:55:21 CEST)" was missed by 0:00:06.061475 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:55:21 CEST)" was missed by 0:00:05.795613 
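The "Run time of job ... was missed by" lines are APScheduler's misfire warning: it fires when an interval job cannot start within "misfire_grace_time" seconds of its scheduled time. Here codecarbon's tracker schedules BaseEmissionsTracker._measure_power once a minute in a background thread, each training step keeps the process busy for ~32 s, and every rank runs its own tracker, hence one copy of the warning per process. A minimal sketch of the same situation (stand-in callback and sleep, not the codecarbon source), assuming APScheduler 3.x, whose default grace period is 1 second:

import time
from apscheduler.schedulers.background import BackgroundScheduler

def measure_power():
    # Stand-in for BaseEmissionsTracker._measure_power: take one power sample.
    print("sampling power draw")

scheduler = BackgroundScheduler()
scheduler.add_job(
    measure_power,
    "interval",
    minutes=1,
    # With the 1 s default grace, any delay like the ~4-10 s seen above logs
    # 'Run time of job ... was missed by ...' and skips that run; a wider
    # grace period lets late runs execute instead.
    misfire_grace_time=30,
)
scheduler.start()
try:
    time.sleep(180)  # stand-in for the training loop keeping the process busy
finally:
    scheduler.shutdown()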
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 16:56:21 CEST)" was missed by 0:00:10.451377
[... same warning repeated by the other worker processes, all missed by ~10 s ...]
- iteration 7750/ 159576 | consumed samples: 342640 | elapsed time per iteration (ms): 32704.0 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 6.754932E+00 | loss scale: 4096.0 | grad norm: 118381.918 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:01:21 CEST)" was missed by 0:00:04.301502
[... same warning repeated by the other worker processes, all missed by ~4 s ...]
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:02:21 CEST)" was missed by 0:00:07.979092
[... same warning repeated by the other worker processes, all missed by ~8 s ...]
- iteration 7760/ 159576 | consumed samples: 343920 | elapsed time per iteration (ms): 32295.8 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 6.762139E+00 | loss scale: 4096.0 | grad norm: 109957.436 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:08:21 CEST)" was missed by 0:00:08.144615
[... same warning repeated by the other worker processes, all missed by ~8 s ...]
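Buried in the scheduler noise, the " iteration .../ 159576 | ..." records are the substantive training signal. A small extraction sketch (a hypothetical helper, not part of the training code; it assumes the pipe-separated "key: value" layout shown above) that turns a raw log like this one into rows suitable for plotting loss or throughput:

import re

ITER_RE = re.compile(r"iteration\s+(\d+)/\s*(\d+)\s*\|(.*)")

def parse_iteration_lines(path):
    # Collect one dict per " iteration N/ M | key: value | ..." record.
    rows = []
    with open(path) as fh:
        for line in fh:
            m = ITER_RE.search(line)
            if not m:
                continue
            row = {"iteration": int(m.group(1)), "total_iterations": int(m.group(2))}
            for field in m.group(3).split("|"):
                if ":" not in field:
                    continue
                key, value = field.split(":", 1)
                try:
                    row[key.strip()] = float(value)
                except ValueError:
                    pass  # skip non-numeric fields
            rows.append(row)
    return rows

# e.g. parse_iteration_lines("main_log.txt")[0] ->
# {'iteration': 7730, 'total_iterations': 159576, 'consumed samples': 340080.0,
#  'lm loss': 6.852581, 'grad norm': 124724.028, ...}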
- iteration 7770/ 159576 | consumed samples: 345200 | elapsed time per iteration (ms): 32575.6 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 6.791389E+00 | loss scale: 4096.0 | grad norm: 217089.521 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:15:21 CEST)" was missed by 0:00:06.130608
[... same warning repeated by the other worker processes, all missed by ~6 s ...]
- iteration 7780/ 159576 | consumed samples: 346480 | elapsed time per iteration (ms): 32078.8 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 6.773003E+00 | loss scale: 4096.0 | grad norm: 116294.271 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:16:21 CEST)" was missed by 0:00:10.123137
[... same warning repeated by the other worker processes, all missed by ~10 s ...]
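As a quick sanity check on these records: logging happens every 10 iterations at a global batch size of 128, so "consumed samples" should advance by 10 * 128 = 1280 between consecutive records, which matches the numbers above; at roughly 32 s per iteration that is about 4 samples per second:

# Figures copied from the iteration records above.
global_batch_size = 128
log_interval = 10
assert 341360 - 340080 == log_interval * global_batch_size  # iterations 7730 -> 7740
assert 346480 - 345200 == log_interval * global_batch_size  # iterations 7770 -> 7780
print(global_batch_size / 32.1)  # ~4.0 samples/s at ~32.1 s per iteration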
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:16:21 CEST)" was missed by 0:00:10.219184 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:16:21 CEST)" was missed by 0:00:10.114908 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:16:21 CEST)" was missed by 0:00:09.858766 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:16:21 CEST)" was missed by 0:00:09.965008 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:16:21 CEST)" was missed by 0:00:10.176088 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:16:21 CEST)" was missed by 0:00:09.862223 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:16:21 CEST)" was missed by 0:00:09.936440 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:16:21 CEST)" was missed by 0:00:10.019360 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:16:21 CEST)" was missed by 0:00:09.925004 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:16:21 CEST)" was missed by 0:00:09.997656 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:16:21 CEST)" was missed by 0:00:10.017163 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:16:21 CEST)" was missed by 0:00:09.968570 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:16:21 CEST)" was missed by 0:00:09.906001 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:16:21 CEST)" was missed by 0:00:09.886158 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:16:21 CEST)" was missed by 0:00:09.897867 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:16:21 CEST)" was missed by 0:00:10.175534 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:16:21 CEST)" was missed by 0:00:10.026313 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:16:21 CEST)" was missed by 0:00:10.088275 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 
2021-09-30 17:16:21 CEST)" was missed by 0:00:09.898073 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:16:21 CEST)" was missed by 0:00:10.055399 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:16:21 CEST)" was missed by 0:00:09.986880 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:16:21 CEST)" was missed by 0:00:10.154541 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:16:21 CEST)" was missed by 0:00:10.186502 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:16:21 CEST)" was missed by 0:00:09.865303 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:16:21 CEST)" was missed by 0:00:09.985937 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:16:21 CEST)" was missed by 0:00:09.904966 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:21:21 CEST)" was missed by 0:00:07.612618 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:21:21 CEST)" was missed by 0:00:07.578946 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:21:21 CEST)" was missed by 0:00:07.650271 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:21:21 CEST)" was missed by 0:00:07.507556 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:21:21 CEST)" was missed by 0:00:07.561992 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:21:21 CEST)" was missed by 0:00:07.316222 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:21:21 CEST)" was missed by 0:00:07.614098 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:21:21 CEST)" was missed by 0:00:07.665569 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:21:21 CEST)" was missed by 0:00:07.555574 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:21:21 CEST)" was missed by 0:00:07.202242 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:21:21 CEST)" was missed by 0:00:07.508855 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:21:21 CEST)" was missed by 0:00:07.637918 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:21:21 CEST)" was missed by 0:00:07.663192 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:21:21 CEST)" was missed by 0:00:07.362155 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:21:21 CEST)" was missed by 0:00:07.707802 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:21:21 CEST)" was missed by 0:00:07.662754 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:21:21 CEST)" was missed by 0:00:07.348304 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:21:21 CEST)" was missed by 0:00:07.387584 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:21:21 CEST)" was missed by 0:00:07.387389 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:21:21 CEST)" was missed by 0:00:07.544888 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:21:21 CEST)" was missed by 0:00:07.414515 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:21:21 CEST)" was missed by 0:00:07.506690 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:21:21 CEST)" was missed by 0:00:07.454608 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:21:21 CEST)" was missed by 0:00:07.375692 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:21:21 CEST)" was missed by 0:00:07.426043 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:21:21 CEST)" was missed by 0:00:07.515854 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:21:21 CEST)" was missed by 0:00:07.708760 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:21:21 CEST)" was missed by 0:00:07.487209 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:21:21 CEST)" was missed by 0:00:07.458166 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:21:21 CEST)" was missed by 0:00:07.577823 
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:21:21 CEST)" was missed by 0:00:07.676070 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:21:21 CEST)" was missed by 0:00:07.351783 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:21:21 CEST)" was missed by 0:00:07.476411 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:21:21 CEST)" was missed by 0:00:07.644113 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:21:21 CEST)" was missed by 0:00:07.395590 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:21:21 CEST)" was missed by 0:00:07.604477 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:21:21 CEST)" was missed by 0:00:07.475513 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:21:21 CEST)" was missed by 0:00:07.394491 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:21:21 CEST)" was missed by 0:00:07.665144 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:21:21 CEST)" was missed by 0:00:07.354874 - iteration 7790/ 159576 | consumed samples: 347760 | elapsed time per iteration (ms): 32923.3 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 6.746706E+00 | loss scale: 4096.0 | grad norm: 151509.203 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:22:21 CEST)" was missed by 0:00:10.727060 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:22:21 CEST)" was missed by 0:00:10.798497 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:22:21 CEST)" was missed by 0:00:10.832185 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:22:21 CEST)" was missed by 0:00:10.833560 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:22:21 CEST)" was missed by 0:00:10.535721 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:22:21 CEST)" was missed by 0:00:10.927276 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:22:21 CEST)" was missed by 0:00:10.869766 
- iteration 7800/ 159576 | consumed samples: 349040 | elapsed time per iteration (ms): 32167.8 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 6.868619E+00 | loss scale: 4096.0 | grad norm: 119301.958 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-saving checkpoint at iteration 7800 to /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints
-[2021-09-30 17:26:29,237] [INFO] [logging.py:68:log_dist] [Rank 0] Saving model checkpoint: /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints/global_step7800/mp_rank_00_model_states.pt
- successfully saved checkpoint at iteration 7800 to /gpfsscratch/rech/six/commun/checkpoints/tr8-104B/checkpoints
-time (ms) | save-checkpoint: 20696.63
- iteration 7810/ 159576 | consumed samples: 350320 | elapsed time per iteration (ms): 33106.8 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 6.930750E+00 | loss scale: 4096.0 | grad norm: 151825.513 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:33:21 CEST)" was missed by 0:00:06.853657
[... 39 similar warnings omitted ...]
- iteration 7820/ 159576 | consumed samples: 351600 | elapsed time per iteration (ms): 32852.0 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 7.076566E+00 | loss scale: 4096.0 | grad norm: 100169.281 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:39:21 CEST)" was missed by 0:00:07.247580
[... 39 similar warnings omitted ...]
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:40:21 CEST)" was missed by 0:00:11.767446
[... 39 similar warnings omitted ...]
- iteration 7830/ 159576 | consumed samples: 352880 | elapsed time per iteration (ms): 32756.5 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 6.954251E+00 | loss scale: 4096.0 | grad norm: 163671.949 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:45:21 CEST)" was missed by 0:00:08.055954
[... 39 similar warnings omitted ...]
-[2021-09-30 17:45:31] PULSE: tr8-104B is running for 13:53:24 since 2021-09-30T03:52:07 (1289770 on 'gpu_p13' partition (r6i4n[5-6,8],r6i5n[4-5],r7i0n[5-8],r7i1n0,r8i2n8,r8i4n1,r8i7n[3-8],r9i0n[0-8],r9i1n[0-8],r9i2n[3-8],r9i3n[7-8],r9i4n[0-2],r9i5n[2,5-7],r9i6n[2-8],r14i7n[1-6])
- iteration 7840/ 159576 | consumed samples: 354160 | elapsed time per iteration (ms): 32892.9 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 6.939928E+00 | loss scale: 4096.0 | grad norm: 92494.673 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:51:21 CEST)" was missed by 0:00:06.648090
[... 39 similar warnings omitted ...]
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:52:21 CEST)" was missed by 0:00:10.077084
[... 39 similar warnings omitted ...]
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:52:21 CEST)" was missed by 0:00:09.802622 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:52:21 CEST)" was missed by 0:00:09.971791 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:52:21 CEST)" was missed by 0:00:09.933648 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:52:21 CEST)" was missed by 0:00:09.902365 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:52:21 CEST)" was missed by 0:00:09.821332 - iteration 7850/ 159576 | consumed samples: 355440 | elapsed time per iteration (ms): 32259.6 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 6.807732E+00 | loss scale: 4096.0 | grad norm: 87184.933 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:58:21 CEST)" was missed by 0:00:06.631856 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:58:21 CEST)" was missed by 0:00:06.489156 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:58:21 CEST)" was missed by 0:00:06.537128 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:58:21 CEST)" was missed by 0:00:06.644322 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:58:21 CEST)" was missed by 0:00:06.594365 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:58:21 CEST)" was missed by 0:00:06.595718 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:58:21 CEST)" was missed by 0:00:06.329898 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:58:21 CEST)" was missed by 0:00:06.377116 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:58:21 CEST)" was missed by 0:00:06.183827 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:58:21 CEST)" was missed by 0:00:06.690296 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:58:21 CEST)" was missed by 0:00:06.490471 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:58:21 CEST)" was missed by 0:00:06.619551 
[... ~40 near-identical warnings for the 17:58:21 run collapsed; delays 0:00:06.18 to 0:00:06.69 ...]
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 17:59:21 CEST)" was missed by 0:00:10.375189
[... ~40 near-identical warnings for the 17:59:21 run collapsed; delays 0:00:09.93 to 0:00:10.43 ...]
- iteration 7860/ 159576 | consumed samples: 356720 | elapsed time per iteration (ms): 32264.2 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 6.810859E+00 | loss scale: 4096.0 | grad norm: 92624.863 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:04:21 CEST)" was missed by 0:00:03.596800
[... ~40 near-identical warnings for the 18:04:21 run collapsed; delays 0:00:03.15 to 0:00:03.66 ...]
- iteration 7870/ 159576 | consumed samples: 358000 | elapsed time per iteration (ms): 32591.4 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 6.822350E+00 | loss scale: 4096.0 | grad norm: 86185.417 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:05:21 CEST)" was missed by 0:00:09.590378
[... ~40 near-identical warnings for the 18:05:21 run collapsed; delays 0:00:09.14 to 0:00:09.65 ...]
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:10:21 CEST)" was missed by 0:00:08.405166
[... ~40 near-identical warnings for the 18:10:21 run collapsed; delays 0:00:07.96 to 0:00:08.46 ...]
- iteration 7880/ 159576 | consumed samples: 359280 | elapsed time per iteration (ms): 33220.9 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 6.873489E+00 | loss scale: 4096.0 | grad norm: 114955.323 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:14:21 CEST)" was missed by 0:00:04.658263
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:14:21 CEST)" was missed by 0:00:04.622102 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:14:21 CEST)" was missed by 0:00:04.403505 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:14:21 CEST)" was missed by 0:00:04.645920 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:14:21 CEST)" was missed by 0:00:04.671182 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:14:21 CEST)" was missed by 0:00:04.210230 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:14:21 CEST)" was missed by 0:00:04.395523 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:14:21 CEST)" was missed by 0:00:04.359768 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:14:21 CEST)" was missed by 0:00:04.433975 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:14:21 CEST)" was missed by 0:00:04.395377 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:14:21 CEST)" was missed by 0:00:04.402380 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:14:21 CEST)" was missed by 0:00:04.715885 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:14:21 CEST)" was missed by 0:00:04.552886 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:14:21 CEST)" was missed by 0:00:04.587082 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:14:21 CEST)" was missed by 0:00:04.652029 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:14:21 CEST)" was missed by 0:00:04.466103 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:14:21 CEST)" was missed by 0:00:04.585784 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:14:21 CEST)" was missed by 0:00:04.612437 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:14:21 CEST)" was missed by 0:00:04.684028 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:14:21 CEST)" was missed by 0:00:04.362792 
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:14:21 CEST)" was missed by 0:00:04.462588 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:14:21 CEST)" was missed by 0:00:04.370174 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:14:21 CEST)" was missed by 0:00:04.324317 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:14:21 CEST)" was missed by 0:00:04.383761 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:14:21 CEST)" was missed by 0:00:04.673081 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:14:21 CEST)" was missed by 0:00:04.484399 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:14:21 CEST)" was missed by 0:00:04.570169 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:14:21 CEST)" was missed by 0:00:04.483509 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:14:21 CEST)" was missed by 0:00:04.514796 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:15:21 CEST)" was missed by 0:00:10.252267 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:15:21 CEST)" was missed by 0:00:10.310645 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:15:21 CEST)" was missed by 0:00:10.109594 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:15:21 CEST)" was missed by 0:00:10.214747 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:15:21 CEST)" was missed by 0:00:10.157539 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:15:21 CEST)" was missed by 0:00:10.216077 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:15:21 CEST)" was missed by 0:00:09.950262 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:15:21 CEST)" was missed by 0:00:10.267517 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:15:21 CEST)" was missed by 0:00:09.989514 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 
2021-09-30 18:15:21 CEST)" was missed by 0:00:10.117787 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:15:21 CEST)" was missed by 0:00:10.110865 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:15:21 CEST)" was missed by 0:00:10.146861 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:15:21 CEST)" was missed by 0:00:10.016526 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:15:21 CEST)" was missed by 0:00:10.089155 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:15:21 CEST)" was missed by 0:00:10.264713 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:15:21 CEST)" was missed by 0:00:09.997502 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:15:21 CEST)" was missed by 0:00:10.239920 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:15:21 CEST)" was missed by 0:00:10.265196 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:15:21 CEST)" was missed by 0:00:09.804232 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:15:21 CEST)" was missed by 0:00:09.918287 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:15:21 CEST)" was missed by 0:00:10.028013 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:15:21 CEST)" was missed by 0:00:09.989402 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:15:21 CEST)" was missed by 0:00:09.996381 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:15:21 CEST)" was missed by 0:00:10.309874 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:15:21 CEST)" was missed by 0:00:10.078377 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:15:21 CEST)" was missed by 0:00:10.181116 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:15:21 CEST)" was missed by 0:00:10.246034 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:15:21 CEST)" was missed by 0:00:10.060127 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:15:21 CEST)" was missed by 0:00:10.179810 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:15:21 CEST)" was missed by 0:00:10.206461 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:15:21 CEST)" was missed by 0:00:10.278038 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:15:21 CEST)" was missed by 0:00:09.956806 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:15:21 CEST)" was missed by 0:00:10.056597 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:15:21 CEST)" was missed by 0:00:09.964176 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:15:21 CEST)" was missed by 0:00:09.953795 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:15:21 CEST)" was missed by 0:00:10.267088 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:15:21 CEST)" was missed by 0:00:10.108792 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:15:21 CEST)" was missed by 0:00:10.164175 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:15:21 CEST)" was missed by 0:00:10.077517 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:15:21 CEST)" was missed by 0:00:09.977767 - iteration 7890/ 159576 | consumed samples: 360560 | elapsed time per iteration (ms): 33473.9 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 6.975810E+00 | loss scale: 4096.0 | grad norm: 108272.141 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | -time (ms) -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:20:21 CEST)" was missed by 0:00:03.071326 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:20:21 CEST)" was missed by 0:00:03.108903 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:20:21 CEST)" was missed by 0:00:03.167265 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:20:21 CEST)" was missed by 0:00:03.014175 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:20:21 CEST)" was missed by 0:00:03.124238 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:20:21 CEST)" was missed by 0:00:03.037674 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:20:21 CEST)" was missed by 0:00:03.121378 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:20:21 CEST)" was missed by 0:00:03.036428 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:20:21 CEST)" was missed by 0:00:03.096597 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:20:21 CEST)" was missed by 0:00:03.121860 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:20:21 CEST)" was missed by 0:00:03.134700 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:20:21 CEST)" was missed by 0:00:03.166530 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:20:21 CEST)" was missed by 0:00:03.123730 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:20:21 CEST)" was missed by 0:00:03.020773 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:20:21 CEST)" was missed by 0:00:03.063098 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:20:21 CEST)" was missed by 0:00:03.072787 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:20:21 CEST)" was missed by 0:00:03.102716 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:20:21 CEST)" was missed by 0:00:03.003602 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:21:21 CEST)" was missed by 0:00:06.687317 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:21:21 CEST)" was missed by 0:00:06.740161 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:21:21 CEST)" was missed by 0:00:06.630181 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:21:21 CEST)" was missed by 0:00:06.724974 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:21:21 CEST)" was missed by 0:00:06.783304 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:21:21 CEST)" was missed by 0:00:06.653662 
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:21:21 CEST)" was missed by 0:00:06.737363
-[previous warning repeated ~33 more times across processes, missed by ~6.3-6.8 s]
- iteration 7900/ 159576 | consumed samples: 361840 | elapsed time per iteration (ms): 32325.4 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 6.944908E+00 | loss scale: 4096.0 | grad norm: 118352.461 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:22:21 CEST)" was missed by 0:00:12.237789
-[previous warning repeated ~39 more times across processes, missed by ~11.8-12.3 s]
18:22:21 CEST)" was missed by 0:00:12.143061 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:22:21 CEST)" was missed by 0:00:11.935801 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:22:21 CEST)" was missed by 0:00:12.253087 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:22:21 CEST)" was missed by 0:00:12.296181 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:22:21 CEST)" was missed by 0:00:12.096390 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:22:21 CEST)" was missed by 0:00:12.095149 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:22:21 CEST)" was missed by 0:00:12.074675 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:22:21 CEST)" was missed by 0:00:12.250255 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:22:21 CEST)" was missed by 0:00:12.200285 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:22:21 CEST)" was missed by 0:00:11.983030 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:22:21 CEST)" was missed by 0:00:11.789749 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:22:21 CEST)" was missed by 0:00:11.981914 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:22:21 CEST)" was missed by 0:00:12.103338 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:22:21 CEST)" was missed by 0:00:12.132362 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:22:21 CEST)" was missed by 0:00:12.002053 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:22:21 CEST)" was missed by 0:00:12.165316 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:22:21 CEST)" was missed by 0:00:12.201628 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:22:21 CEST)" was missed by 0:00:11.942334 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:22:21 CEST)" was missed by 0:00:12.225489 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:22:21 CEST)" was missed by 0:00:12.250734 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:22:21 CEST)" was missed by 0:00:11.975076 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:22:21 CEST)" was missed by 0:00:11.949682 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:22:21 CEST)" was missed by 0:00:11.903854 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:22:21 CEST)" was missed by 0:00:11.939326 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:22:21 CEST)" was missed by 0:00:12.013552 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:22:21 CEST)" was missed by 0:00:11.974926 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:22:21 CEST)" was missed by 0:00:12.295416 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:22:21 CEST)" was missed by 0:00:12.252608 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:22:21 CEST)" was missed by 0:00:12.063918 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:22:21 CEST)" was missed by 0:00:12.166647 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:22:21 CEST)" was missed by 0:00:12.231575 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:22:21 CEST)" was missed by 0:00:12.045677 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:22:21 CEST)" was missed by 0:00:12.191984 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:22:21 CEST)" was missed by 0:00:12.263588 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:22:21 CEST)" was missed by 0:00:12.042161 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:22:21 CEST)" was missed by 0:00:12.149694 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:22:21 CEST)" was missed by 0:00:11.963299 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:22:21 CEST)" was missed by 0:00:12.094331 
- iteration 7910/ 159576 | consumed samples: 363120 | elapsed time per iteration (ms): 32635.6 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 6.895307E+00 | loss scale: 4096.0 | grad norm: 146408.442 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:27:21 CEST)" was missed by 0:00:05.353297
-[previous warning repeated ~39 more times across processes, missed by ~4.8-5.4 s]
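The iteration lines threaded through the noise are the useful signal: at step 7910 the run moved a global batch of 128 samples in 32.6356 s, i.e. about 3.9 samples/s, with the LM loss drifting through the 6.7-6.9 range across steps 7900-7950. A throwaway parser for these lines (the field layout is read off this log; nothing here is Megatron API, and samples/s is derived, not logged):

import re

ITER_RE = re.compile(
    r"iteration\s+(?P<step>\d+)/\s*(?P<total>\d+)\s*\|"
    r".*?consumed samples:\s*(?P<samples>\d+)\s*\|"
    r".*?elapsed time per iteration \(ms\):\s*(?P<ms>[\d.]+)\s*\|"
    r".*?global batch size:\s*(?P<gbs>\d+)\s*\|"
    r".*?lm loss:\s*(?P<loss>[\d.eE+-]+)"
)

def summarize(line):
    m = ITER_RE.search(line)
    if not m:
        return None
    ms = float(m["ms"])
    gbs = int(m["gbs"])
    return {
        "step": int(m["step"]),
        "consumed_samples": int(m["samples"]),
        "loss": float(m["loss"]),
        "samples_per_s": gbs / (ms / 1000.0),  # 128 / 32.6356 s ~= 3.9
    }

line = (" iteration 7910/ 159576 | consumed samples: 363120 | "
        "elapsed time per iteration (ms): 32635.6 | learning rate: 6.000E-05 | "
        "global batch size: 128 | lm loss: 6.895307E+00 |")
print(summarize(line))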
2021-09-30 18:27:21 CEST)" was missed by 0:00:05.102837 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:27:21 CEST)" was missed by 0:00:05.120145 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:27:21 CEST)" was missed by 0:00:05.223891 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:27:21 CEST)" was missed by 0:00:05.206925 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:27:21 CEST)" was missed by 0:00:05.020546 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:27:21 CEST)" was missed by 0:00:05.151578 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:28:21 CEST)" was missed by 0:00:10.629086 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:28:21 CEST)" was missed by 0:00:10.429291 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:28:21 CEST)" was missed by 0:00:10.596425 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:28:21 CEST)" was missed by 0:00:10.570783 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:28:21 CEST)" was missed by 0:00:10.428113 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:28:21 CEST)" was missed by 0:00:10.335004 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:28:21 CEST)" was missed by 0:00:10.407609 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:28:21 CEST)" was missed by 0:00:10.583169 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:28:21 CEST)" was missed by 0:00:10.498235 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:28:21 CEST)" was missed by 0:00:10.533206 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:28:21 CEST)" was missed by 0:00:10.476011 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:28:21 CEST)" was missed by 0:00:10.534566 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:28:21 CEST)" was missed by 0:00:10.268782 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:28:21 CEST)" was missed by 0:00:10.315972 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:28:21 CEST)" was missed by 0:00:10.583657 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:28:21 CEST)" was missed by 0:00:10.586052 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:28:21 CEST)" was missed by 0:00:10.122673 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:28:21 CEST)" was missed by 0:00:10.307986 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:28:21 CEST)" was missed by 0:00:10.236794 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:28:21 CEST)" was missed by 0:00:10.346476 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:28:21 CEST)" was missed by 0:00:10.628361 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:28:21 CEST)" was missed by 0:00:10.585540 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:28:21 CEST)" was missed by 0:00:10.465332 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:28:21 CEST)" was missed by 0:00:10.396838 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:28:21 CEST)" was missed by 0:00:10.499545 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:28:21 CEST)" was missed by 0:00:10.564489 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:28:21 CEST)" was missed by 0:00:10.378594 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:28:21 CEST)" was missed by 0:00:10.524900 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:28:21 CEST)" was missed by 0:00:10.275255 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:28:21 CEST)" was missed by 0:00:10.375069 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:28:21 CEST)" was missed by 0:00:10.395953 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:28:21 CEST)" was missed by 0:00:10.558504 
- iteration 7920/ 159576 | consumed samples: 364400 | elapsed time per iteration (ms): 32264.0 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 6.883556E+00 | loss scale: 4096.0 | grad norm: 122577.473 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:34:21 CEST)" was missed by 0:00:03.491557
-[previous warning repeated ~38 more times across processes, missed by ~3.1-3.5 s]
2021-09-30 18:34:21 CEST)" was missed by 0:00:03.259341 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:34:21 CEST)" was missed by 0:00:03.426991 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:34:21 CEST)" was missed by 0:00:03.258420 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:34:21 CEST)" was missed by 0:00:03.420936 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:34:21 CEST)" was missed by 0:00:03.170522 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:34:21 CEST)" was missed by 0:00:03.145085 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:34:21 CEST)" was missed by 0:00:03.395824 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:34:21 CEST)" was missed by 0:00:03.298811 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:34:21 CEST)" was missed by 0:00:03.362171 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:34:21 CEST)" was missed by 0:00:03.345224 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:34:21 CEST)" was missed by 0:00:03.158894 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:34:21 CEST)" was missed by 0:00:03.289875 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:35:21 CEST)" was missed by 0:00:07.691335 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:35:21 CEST)" was missed by 0:00:07.633001 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:35:21 CEST)" was missed by 0:00:07.538251 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:35:21 CEST)" was missed by 0:00:07.596800 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:35:21 CEST)" was missed by 0:00:07.648265 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:35:21 CEST)" was missed by 0:00:07.491566 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:35:21 CEST)" was missed by 0:00:07.490330 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:35:21 CEST)" was missed by 0:00:07.527500 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:35:21 CEST)" was missed by 0:00:07.469827 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:35:21 CEST)" was missed by 0:00:07.645438 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:35:21 CEST)" was missed by 0:00:07.560448 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:35:21 CEST)" was missed by 0:00:07.587115 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:35:21 CEST)" was missed by 0:00:07.658710 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:35:21 CEST)" was missed by 0:00:07.595500 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:35:21 CEST)" was missed by 0:00:07.378193 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:35:21 CEST)" was missed by 0:00:07.645889 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:35:21 CEST)" was missed by 0:00:07.298988 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:35:21 CEST)" was missed by 0:00:07.377083 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:35:21 CEST)" was missed by 0:00:07.690580 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:35:21 CEST)" was missed by 0:00:07.397260 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:35:21 CEST)" was missed by 0:00:07.331015 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:35:21 CEST)" was missed by 0:00:07.337484 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:35:21 CEST)" was missed by 0:00:07.620705 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:35:21 CEST)" was missed by 0:00:07.370286 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:35:21 CEST)" was missed by 0:00:07.334497 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:35:21 CEST)" was missed by 0:00:07.408732 
- iteration 7930/ 159576 | consumed samples: 365680 | elapsed time per iteration (ms): 32012.4 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 6.774232E+00 | loss scale: 4096.0 | grad norm: 132241.887 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7940/ 159576 | consumed samples: 366960 | elapsed time per iteration (ms): 31069.5 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 6.763776E+00 | loss scale: 4096.0 | grad norm: 212223.272 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:45:21 CEST)" was missed by 0:00:03.748492
-[previous warning repeated ~39 more times across processes, missed by ~3.3-3.8 s]
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:45:21 CEST)" was missed by 0:00:03.605789 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:45:21 CEST)" was missed by 0:00:03.653768 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:45:21 CEST)" was missed by 0:00:03.300463 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:45:21 CEST)" was missed by 0:00:03.414489 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:45:21 CEST)" was missed by 0:00:03.760956 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:45:21 CEST)" was missed by 0:00:03.446530 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:45:21 CEST)" was missed by 0:00:03.493753 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:45:21 CEST)" was missed by 0:00:03.763850 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:45:21 CEST)" was missed by 0:00:03.485628 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:45:21 CEST)" was missed by 0:00:03.676041 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:45:21 CEST)" was missed by 0:00:03.711036 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:45:21 CEST)" was missed by 0:00:03.736196 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:45:21 CEST)" was missed by 0:00:03.761441 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:45:21 CEST)" was missed by 0:00:03.524273 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:45:21 CEST)" was missed by 0:00:03.607176 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:45:21 CEST)" was missed by 0:00:03.774241 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:45:21 CEST)" was missed by 0:00:03.712399 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:45:21 CEST)" was missed by 0:00:03.806121 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:45:21 CEST)" was missed by 0:00:03.807014 
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:45:21 CEST)" was missed by 0:00:03.585479 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:45:21 CEST)" was missed by 0:00:03.677368 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:45:21 CEST)" was missed by 0:00:03.742319 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:45:21 CEST)" was missed by 0:00:03.556403 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:45:21 CEST)" was missed by 0:00:03.453058 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:45:21 CEST)" was missed by 0:00:03.552865 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:45:21 CEST)" was missed by 0:00:03.460420 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:45:21 CEST)" was missed by 0:00:03.450066 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:45:21 CEST)" was missed by 0:00:03.492709 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:45:21 CEST)" was missed by 0:00:03.763340 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:45:21 CEST)" was missed by 0:00:03.643174 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:45:21 CEST)" was missed by 0:00:03.660430 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:45:21 CEST)" was missed by 0:00:03.702721 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:45:21 CEST)" was missed by 0:00:03.573799 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:45:21 CEST)" was missed by 0:00:03.485895 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:45:21 CEST)" was missed by 0:00:03.512909 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:45:21 CEST)" was missed by 0:00:03.605057 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:45:21 CEST)" was missed by 0:00:03.474084 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 
2021-09-30 18:45:21 CEST)" was missed by 0:00:03.614216 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:45:21 CEST)" was missed by 0:00:03.574741 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:46:21 CEST)" was missed by 0:00:04.987070 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:46:21 CEST)" was missed by 0:00:04.949477 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:46:21 CEST)" was missed by 0:00:04.892313 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:46:21 CEST)" was missed by 0:00:04.653038 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:46:21 CEST)" was missed by 0:00:04.844385 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:46:21 CEST)" was missed by 0:00:04.915800 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:46:21 CEST)" was missed by 0:00:04.898837 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:46:21 CEST)" was missed by 0:00:04.999518 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:46:21 CEST)" was missed by 0:00:04.914547 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:46:21 CEST)" was missed by 0:00:05.012738 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:46:21 CEST)" was missed by 0:00:04.685092 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:46:21 CEST)" was missed by 0:00:04.732260 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:46:21 CEST)" was missed by 0:00:04.999971 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:46:21 CEST)" was missed by 0:00:04.539023 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:46:21 CEST)" was missed by 0:00:04.724153 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:46:21 CEST)" was missed by 0:00:05.044678 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:46:21 CEST)" was missed by 0:00:05.045526 -WARNING:apscheduler.executors.default:Run time of job 
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:46:21 CEST)" was missed by 0:00:04.845730 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:46:21 CEST)" was missed by 0:00:04.881661 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:46:21 CEST)" was missed by 0:00:04.823981 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:46:21 CEST)" was missed by 0:00:04.843486 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:46:21 CEST)" was missed by 0:00:04.941234 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:46:21 CEST)" was missed by 0:00:04.950951 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:46:21 CEST)" was missed by 0:00:04.974784 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:46:21 CEST)" was missed by 0:00:05.002399 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:46:21 CEST)" was missed by 0:00:04.698915 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:46:21 CEST)" was missed by 0:00:04.688576 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:46:21 CEST)" was missed by 0:00:04.712509 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:46:21 CEST)" was missed by 0:00:04.762824 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:46:21 CEST)" was missed by 0:00:04.731199 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:46:21 CEST)" was missed by 0:00:05.001882 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:46:21 CEST)" was missed by 0:00:04.852662 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:46:21 CEST)" was missed by 0:00:04.751413 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:46:21 CEST)" was missed by 0:00:04.980850 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:46:21 CEST)" was missed by 0:00:04.794950 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:46:21 CEST)" was missed by 0:00:04.691598 
-[2021-09-30 18:45:42] PULSE: tr8-104B is running for 14:53:35 since 2021-09-30T03:52:07 (1289770 on 'gpu_p13' partition (r6i4n[5-6,8],r6i5n[4-5],r7i0n[5-8],r7i1n0,r8i2n8,r8i4n1,r8i7n[3-8],r9i0n[0-8],r9i1n[0-8],r9i2n[3-8],r9i3n[7-8],r9i4n[0-2],r9i5n[2,5-7],r9i6n[2-8],r14i7n[1-6]))
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:47:21 CEST)" was missed by 0:00:07.745560
-[previous warning repeated ~39 more times across processes, missed by ~7.4-7.9 s]
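The node list in the PULSE line above uses SLURM's compressed hostlist syntax. On the cluster, scontrol show hostnames 'r6i4n[5-6,8]' expands it natively; for reading logs offline, a few lines of Python approximate the same expansion (this helper is not part of the training tooling, and it ignores zero-padded ranges):

import re

def expand_nodelist(nodelist):
    hosts = []
    # Match either "prefix[ranges]" or a bare hostname token.
    for prefix, ranges, bare in re.findall(
            r"([^,\[]+)\[([^\]]+)\]|([^,\[\]]+)", nodelist):
        if bare:
            hosts.append(bare)
            continue
        for part in ranges.split(","):
            lo, _, hi = part.partition("-")
            for i in range(int(lo), int(hi or lo) + 1):
                hosts.append(f"{prefix}{i}")
    return hosts

print(expand_nodelist("r6i4n[5-6,8],r7i1n0"))
# -> ['r6i4n5', 'r6i4n6', 'r6i4n8', 'r7i1n0']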
time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:47:21 CEST)" was missed by 0:00:07.565736 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:47:21 CEST)" was missed by 0:00:07.855090 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:47:21 CEST)" was missed by 0:00:07.699005 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:47:21 CEST)" was missed by 0:00:07.734945 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:47:21 CEST)" was missed by 0:00:07.677238 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:47:21 CEST)" was missed by 0:00:07.696749 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:47:21 CEST)" was missed by 0:00:07.648184 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:47:21 CEST)" was missed by 0:00:07.767855 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:47:21 CEST)" was missed by 0:00:07.866083 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:47:21 CEST)" was missed by 0:00:07.804219 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:47:21 CEST)" was missed by 0:00:07.585587 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:47:21 CEST)" was missed by 0:00:07.828043 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:47:21 CEST)" was missed by 0:00:07.552183 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:47:21 CEST)" was missed by 0:00:07.541829 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:47:21 CEST)" was missed by 0:00:07.616064 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:47:21 CEST)" was missed by 0:00:07.577454 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:47:21 CEST)" was missed by 0:00:07.584483 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:47:21 CEST)" was missed by 0:00:07.897958 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:47:21 CEST)" was missed by 
- iteration 7950/ 159576 | consumed samples: 368240 | elapsed time per iteration (ms): 31297.4 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 6.795293E+00 | loss scale: 4096.0 | grad norm: 120430.265 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:48:21 CEST)" was missed by 0:00:10.927797
-[... same warning repeated across the remaining ranks, missed by 0:00:10.479722 to 0:00:10.986232 ...]
- iteration 7960/ 159576 | consumed samples: 369520 | elapsed time per iteration (ms): 31049.5 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 6.808997E+00 | loss scale: 4096.0 | grad norm: 91332.967 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
- iteration 7970/ 159576 | consumed samples: 370800 | elapsed time per iteration (ms): 31125.6 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 6.796875E+00 | loss scale: 4096.0 | grad norm: 99831.689 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 18:59:21 CEST)" was missed by 0:00:03.213501
-[... same warning repeated across the remaining ranks, missed by 0:00:03.021062 to 0:00:03.413349 ...]
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 19:00:21 CEST)" was missed by 0:00:05.737217
-[... same warning repeated across the remaining ranks, missed by 0:00:05.289349 to 0:00:05.795673 ...]
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 19:00:21 CEST)" was missed by 0:00:05.594591 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 19:00:21 CEST)" was missed by 0:00:05.664687 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 19:00:21 CEST)" was missed by 0:00:05.642519 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 19:00:21 CEST)" was missed by 0:00:05.701074 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 19:00:21 CEST)" was missed by 0:00:05.435195 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 19:00:21 CEST)" was missed by 0:00:05.403240 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 19:00:21 CEST)" was missed by 0:00:05.795673 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 19:00:21 CEST)" was missed by 0:00:05.595859 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 19:00:21 CEST)" was missed by 0:00:05.666065 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 19:00:21 CEST)" was missed by 0:00:05.749720 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 19:00:21 CEST)" was missed by 0:00:05.762939 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 19:00:21 CEST)" was missed by 0:00:05.699777 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 19:00:21 CEST)" was missed by 0:00:05.724913 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 19:00:21 CEST)" was missed by 0:00:05.750164 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 19:00:21 CEST)" was missed by 0:00:05.512949 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 19:00:21 CEST)" was missed by 0:00:05.474381 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 19:00:21 CEST)" was missed by 0:00:05.794857 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 19:00:21 CEST)" was missed by 0:00:05.501545 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 19:00:21 CEST)" was missed by 0:00:05.574178 
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 19:00:21 CEST)" was missed by 0:00:05.731027 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 19:00:21 CEST)" was missed by 0:00:05.649107 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 19:00:21 CEST)" was missed by 0:00:05.545091 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 19:00:21 CEST)" was missed by 0:00:05.691370 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 19:00:21 CEST)" was missed by 0:00:05.482483 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 19:00:21 CEST)" was missed by 0:00:05.441705 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 19:00:21 CEST)" was missed by 0:00:05.541597 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 19:00:21 CEST)" was missed by 0:00:05.752574 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 19:00:21 CEST)" was missed by 0:00:05.449130 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 19:00:21 CEST)" was missed by 0:00:05.438793 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 19:00:21 CEST)" was missed by 0:00:05.481414 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 19:00:21 CEST)" was missed by 0:00:05.752105 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 19:00:21 CEST)" was missed by 0:00:05.602813 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 19:00:21 CEST)" was missed by 0:00:05.631860 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 19:00:21 CEST)" was missed by 0:00:05.563401 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 19:00:21 CEST)" was missed by 0:00:05.593723 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 19:00:21 CEST)" was missed by 0:00:05.562500 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 19:00:21 CEST)" was missed by 0:00:05.474593 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 
-WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 19:01:21 CEST)" was missed by 0:00:08.453724
-[... same warning repeated across the remaining ranks, missed by 0:00:08.005754 to 0:00:08.512158 ...]
"BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 19:01:21 CEST)" was missed by 0:00:08.258091 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 19:01:21 CEST)" was missed by 0:00:08.441453 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 19:01:21 CEST)" was missed by 0:00:08.466694 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 19:01:21 CEST)" was missed by 0:00:08.469053 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 19:01:21 CEST)" was missed by 0:00:08.005754 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 19:01:21 CEST)" was missed by 0:00:08.165623 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 19:01:21 CEST)" was missed by 0:00:08.119815 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 19:01:21 CEST)" was missed by 0:00:08.229465 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 19:01:21 CEST)" was missed by 0:00:08.190897 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 19:01:21 CEST)" was missed by 0:00:08.197899 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 19:01:21 CEST)" was missed by 0:00:08.511407 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 19:01:21 CEST)" was missed by 0:00:08.279891 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 19:01:21 CEST)" was missed by 0:00:08.382597 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 19:01:21 CEST)" was missed by 0:00:08.447550 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 19:01:21 CEST)" was missed by 0:00:08.365634 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 19:01:21 CEST)" was missed by 0:00:08.261608 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 19:01:21 CEST)" was missed by 0:00:08.278975 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 19:01:21 CEST)" was missed by 0:00:08.191068 -WARNING:apscheduler.executors.default:Run time of job "BaseEmissionsTracker._measure_power (trigger: interval[0:01:00], next run at: 2021-09-30 19:01:21 CEST)" was missed by 0:00:08.155297 
- iteration 7980/ 159576 | consumed samples: 372080 | elapsed time per iteration (ms): 31471.0 | learning rate: 6.000E-05 | global batch size: 128 | lm loss: 6.763995E+00 | loss scale: 4096.0 | grad norm: 142552.546 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
-time (ms)
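The iteration lines above are easier to compare once parsed. Here is a minimal sketch of a parser for the pipe-separated format visible in this log; the regex and field choices are inferred from these lines rather than taken from Megatron's logging code:

```python
import re

# Pattern inferred from the iteration lines in this log (not Megatron's source).
ITER_RE = re.compile(
    r"iteration\s+(\d+)/\s*(\d+)"
    r"\s*\|\s*consumed samples:\s*(\d+)"
    r".*?elapsed time per iteration \(ms\):\s*([\d.]+)"
    r".*?global batch size:\s*(\d+)"
    r".*?lm loss:\s*([\d.Ee+-]+)"
)

def parse_iteration(line):
    """Return (step, total_steps, consumed_samples, ms_per_iter, batch_size, loss)."""
    m = ITER_RE.search(line)
    if m is None:
        return None
    step, total, samples, ms, gbs, loss = m.groups()
    return int(step), int(total), int(samples), float(ms), int(gbs), float(loss)

line = ("iteration 7980/ 159576 | consumed samples: 372080 | "
        "elapsed time per iteration (ms): 31471.0 | learning rate: 6.000E-05 | "
        "global batch size: 128 | lm loss: 6.763995E+00 | loss scale: 4096.0 |")

step, total, samples, ms, gbs, loss = parse_iteration(line)
# 128 samples every ~31.5 s is roughly 4 samples/s, consistent with
# consumed samples rising by 1280 across each 10-iteration report window.
print(f"step {step}/{total}: {gbs / (ms / 1000):.2f} samples/s, lm loss {loss:.4f}")
```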